The exhibition aims at an academic exchange of knowledge and ideas between researchers, triggered by appealing demonstrations of multimedia technology. Each selected demonstration (see below) is entitled to two pages in the final proceedings and on the conference CD.
Date and cost
Demonstrations of multimedia technology take place on July 7, the second day of the conference (cost: 500 Euro per table). A booth selling books can be set up for the full duration of the conference (cost: 2100 Euro per booth).
o April 30, 2005: deadline for submission of camera-ready full versions of the papers (see the submission page for more details).
o July 7, 2005: demonstration day
Selected demonstrations
CATCH: Continuous Access to Cultural Heritage
9. Linking the 8 o’clock news with context information
10. Automatic extraction of brushstrokes from paintings
11. Cultural Heritage Information Personalization
MultimediaN Concert-video Browser
Authors:
Ynze van Houten1, Umut Naci2, Bauke Freiburg3, Robbert Eggermont2, Sander Schuurman3, Danny Hollander3, Jaap Reitsma1, Maurice Markslag1, Justin Kniest3, Mettina Veenstra1, Alan Hanjalic2
Affiliations, email & websites:
1Telematica Instituut
P.O. Box 589, 7500 AN Enschede, The Netherlands
{Ynze.vanHouten, Jaap.Reitsma, Maurice.Markslag, Mettina.Veenstra}@telin.nl
2Delft University of Technology, Department of Mediamatics, Information and Communication Theory Group
Mekelweg 4, 2628 CD Delft, The Netherlands
{U.Naci, R.Eggermont, A.Hanjalic}@EWI.TUDelft.nl
www-ict.ewi.tudelft.nl
3Stichting Fabchannel
Weteringschans 6-8, 1017 SG Amsterdam, The Netherlands
Abstract:
The MultimediaN concert-video browser demonstrates a video interaction environment for efficiently browsing video recordings of pop concerts performed at the Dutch concert halls Paradiso and Melkweg, available on the Fabchannel website (www.fabchannel.com). The exhibition shows the current state of the project, which aims to deliver an advanced concert-video browser in 2007. Three demos are provided. The first demo shows a high-level parsing algorithm that automatically detects the boundaries of semantically coherent temporal segments in concert videos and automatically generates input to the browser for non-linear access to different segments of a music video. The second demo shows a general-purpose video editor and browser: results from the parsing algorithm are used to segment concert videos, and semantic descriptions (attributes) are associated with the segments using the editor. The video browser applies ideas from information foraging theory and demonstrates patch-based video browsing, where a patch is defined as a collection of segments sharing the same attribute. The last demo displays the current design of the Fabplayer, an application for viewing concert videos that includes images from a 360-degree camera. Additionally, a mock-up of the new design of the Fabplayer will be displayed, which will make use of automatic concert-video segmentation and will support patch-based video browsing.
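The patch-based browsing idea above amounts to grouping annotated segments by their shared attribute; a minimal sketch follows, in which the segment boundaries and attribute names are invented for illustration and do not reflect the project's actual data model.

```python
from collections import defaultdict

# Hypothetical annotated segments: (start_sec, end_sec, attribute),
# as would be produced by the parsing and editing steps described above.
segments = [
    (0, 45, "guitar solo"),
    (45, 120, "vocals"),
    (120, 160, "guitar solo"),
    (160, 300, "audience"),
]

def build_patches(segments):
    """Group segments into 'patches': collections of segments
    sharing the same attribute (information-foraging style)."""
    patches = defaultdict(list)
    for start, end, attribute in segments:
        patches[attribute].append((start, end))
    return dict(patches)

patches = build_patches(segments)
# A browser can now jump between all "guitar solo" segments:
print(patches["guitar solo"])  # [(0, 45), (120, 160)]
```

A browser built on such patches lets the viewer move within a patch (all segments with one attribute) or switch to another patch, rather than scrubbing linearly through the video.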
Real-time and Distributed AV Content Analysis System for Consumer
Electronics Networks
Authors & Affiliations:
Jan Nesvadba, Philips Research Eindhoven, The Netherlands
Pedro Fonseca, Philips Research Eindhoven, The Netherlands
Dzevdet Burazerovic, Philips Research Eindhoven, The Netherlands
Martijn Thijssen, Philips Research Eindhoven, The Netherlands
Patrick van Kaam, Philips Research Eindhoven, The Netherlands
Alexander Sinitsyn, Philips Research Eindhoven, The Netherlands
Marc Peters, Philips Research Eindhoven, The Netherlands
Harry Broers, Philips CFT Eindhoven, The Netherlands
Bart Kroon, Philips Research Eindhoven & TU Delft, The Netherlands
Hasan Celik, Philips Research Eindhoven & TU Delft, The Netherlands
Alan Hanjalic, TU Delft, The Netherlands
Umut Naci, TU Delft, The Netherlands
Robbert Eggermont, TU Delft, The Netherlands
Johan Lukkien, TU Eindhoven, The Netherlands
Andrei Korostelev, Philips Research & TU Eindhoven, The Netherlands
Jan Ypma, Philips Research & TU Eindhoven, The Netherlands
Peter de With, TU Eindhoven, The Netherlands
Jungong Han, TU Eindhoven, The Netherlands
Website:
http://www.research.philips.com/technologies/storage/cassandra/
http://www.extra.research.philips.com/euprojects/candela/
Abstract:
Philips Research and its project partners (within the projects MultimediaN and Candela) together demonstrate a real-time audiovisual content analysis system consisting of Service Units (i.e. networked software modules), visualization interfaces, and data management components. The system includes advanced analysis components such as audio classification, music analysis, automatic speech recognition, audiovisual scene segmentation, audiovisual genre classification, and face detection. Furthermore, smart system and connection management coordinates the components scattered across various processing platforms. The system management provides functions such as system monitoring, auto-healing, auto-recovery and self-configuration, which are required for reliable in-home network solutions. Visitors will be able to witness the real-time audiovisual content analysis results on a wall of LCD displays. Additionally, the system and connection management functions, which enable stable operation of the distributed content analysis system, are visualized in a separate user interface. In addition, a data management component for persistent storage and retrieval of audiovisual features and an MPEG-7 conversion tool are demonstrated. In parallel, a camera captures the exhibition floor to demonstrate the achievable real-time face detection results. Finally, an automatic tennis sport analysis unit detects and follows the positions of the court and players, and uses a 3-D camera model to analyze playing behavior in the real world.
Members and affiliation:
P. Merkus, Bosch Security Systems, The Netherlands
E. Jaspers, Bosch Security Systems, The Netherlands
R. Wijnhoven, Bosch Security Systems, Eindhoven, The Netherlands
R. Albers, Bosch Security Systems, Eindhoven, The Netherlands
J.-F. Delaigle, Multitel, Mons, Belgium
X. Desurmont, Multitel, Mons, Belgium
B. Lienard, Multitel, Mons, Belgium
J. Hamaide, Multitel, Mons, Belgium
M. Barais, Vrije Universiteit Brussel, Brussels, Belgium
P. Pietarila, VTT, Oulu, Finland
J. Palo, Solid, Oulu, Finland
Website:
http://www.extra.research.philips.com/euprojects/candela/
Abstract:
Although many different types of information-system technologies have evolved over the last decades (such as databases, video systems, the Internet and mobile telecommunication), the integration of these technologies is still in its infancy and has the potential to introduce "intelligent" systems. To unleash the full potential of such integration, video content analysis (VCA) techniques are being applied in several application areas. This demonstrator represents the surveillance application. Live cameras capture the exhibition audience, and real-time video content analysis is applied: moving objects are tracked and classified, and the resulting analysis information, including properties such as size, speed and behavior, is stored in a relational database. Simultaneously, a remote client demonstrates on-the-fly retrieval of the live recorded content through high-level semantic search queries. For example, a single search can show the video parts containing abandoned objects or loitering persons, or retrieve the vehicles that followed a specific trajectory by drawing a trajectory curve on a video picture. Moreover, the distributed database technology is demonstrated by alerting a guard at any location and streaming the associated camera signal to his mobile device. This demonstrator shows the novelties of VCA for surveillance applications.
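The kind of semantic query described above can be sketched with a small relational example; the table schema, column names and thresholds below are illustrative assumptions, not the demonstrator's actual database design.

```python
import sqlite3

# Toy relational store of tracked-object properties, in the spirit of
# the VCA pipeline described above (size/speed/behavior per object).
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE tracked_objects (
    id INTEGER, class TEXT, avg_speed REAL, duration_sec REAL)""")
con.executemany(
    "INSERT INTO tracked_objects VALUES (?, ?, ?, ?)",
    [(1, "person", 0.1, 95.0),   # nearly stationary for a long time
     (2, "person", 1.4, 20.0),   # walking through briefly
     (3, "vehicle", 8.0, 12.0)])

# A "loitering person" query: a person moving slowly for a long period.
loiterers = con.execute(
    """SELECT id FROM tracked_objects
       WHERE class = 'person' AND avg_speed < 0.5 AND duration_sec > 60"""
).fetchall()
print(loiterers)  # [(1,)]
```

A single SQL query over stored analysis results is what turns low-level tracking output into the high-level semantic searches ("abandoned object", "loitering person") mentioned in the abstract.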
Members and affiliation:
P. Pietarila, VTT Electronics, Oulu, Finland
S. Järvinen, VTT Electronics, Oulu, Finland
Jari Korva, VTT Electronics, Oulu, Finland
Janne Lahti, VTT Electronics, Oulu, Finland
Henri Löthman, VTT Electronics, Oulu, Finland
Utz Westermann, VTT Electronics, Oulu, Finland
Jorma Palo, Solid Information Technology, Oulu, Finland
Jarkko Nisula, Hantro Products, Oulu, Finland
R. Wijnhoven, Bosch Security Systems, Eindhoven, The Netherlands
Website:
http://www.extra.research.philips.com/euprojects/candela/
Abstract:
Integrated cameras and color displays are making mobile phones increasingly attractive not just for video consumption but also for video production, in particular in the domain of home video. Current research in home video management has mostly regarded mobile devices such as mobile phones as additional access channels for video consumption. The Personal Mobile Multimedia Management demonstrator implements an end-to-end system for personal video production, retrieval and consumption using mobile devices and distributed databases. The demonstrator enables users to query the home video database compiled by our research team and retrieve interesting segments of the videos based on the annotated metadata. The videos can be queried from and streamed to different devices, with the user interfaces, content and video streams scaled accordingly. Exhibition visitors are able to use a mobile terminal to capture video clips, annotate them, insert them into the distributed database, and then search, retrieve and view them. Additionally, the system handles real-time data from an adjacent surveillance system to demonstrate the use of mobile terminals by, e.g., security guards to review video snippets of suspicious events.
Object Recognition by a Robot Dog Connected to a Wide-Area Grid System
Members and affiliation:
J.M. Geusebroek and F.J. Seinstra
Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
{mark, fjseins}@science.uva.nl
Abstract:
We will demonstrate object recognition performed by a Sony Aibo robot dog. The dog is connected to a wide-area Grid system consisting of hundreds of computers located at several institutes in Europe. Object recognition is achieved by matching local histograms of color-invariant features against a learned database. We effectively decompose object appearance recognition into a view-based (learned) part and an appearance (invariant) part. Invariance deals with lighting conditions, color constancy, and robustness against shading effects and cast shadows. A learned set of object views guarantees recognition of different aspects of the object. As such, we show the state of the art in object recognition in images, as well as the state of the art in multimedia Grid computing, merged into a single application.
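The matching step described above, comparing feature histograms against a learned database, can be sketched as follows; the feature extraction itself is omitted, and the histograms, bin counts and object labels are hypothetical placeholders, not the demo's actual data.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms (1.0 = identical)."""
    return float(np.minimum(h1, h2).sum())

def recognize(query_hist, database):
    """Return the database label whose stored histogram best
    matches the query histogram (nearest-neighbor matching)."""
    return max(database,
               key=lambda label: histogram_intersection(query_hist,
                                                        database[label]))

# Toy 'learned database': one normalized 4-bin histogram per object view.
database = {
    "ball": np.array([0.7, 0.1, 0.1, 0.1]),
    "bone": np.array([0.1, 0.6, 0.2, 0.1]),
}
query = np.array([0.65, 0.15, 0.1, 0.1])  # histogram from a new image
print(recognize(query, database))  # ball
```

In the real system each object is represented by many learned views and the histograms are built from color-invariant features, which gives the robustness to lighting and shadows mentioned above.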
Jörg Baldzer
Multimedia and Internet Information Services
OFFIS
Oldenburg, Germany
balder@offis.de
Sabine Thieme
Multimedia and Internet Information Services
OFFIS
Oldenburg, Germany
thieme@offis.de
Niels Rosenhäger
Institute for Communications Technology
Technical University of Brunswick
Brunswick, Germany
n.rosenhaeger@tu-bs.de
Susanne Boll
Department of Computing Science
University of Oldenburg
Oldenburg, Germany
boll@informatik.uni-oldenburg.de
Hans-Jürgen Appelrath
Department of Computing Science
University of Oldenburg
Oldenburg, Germany
appelrath@informatik.uni-oldenburg.de
Website:
http://www.niccimon.de/nightscenelive
Abstract:
The increased mobility and access to information provided by today’s mobile devices are both scientifically and commercially interesting topics. In this respect, the emerging standard for Digital Video Broadcasting - Handheld (DVB-H) not only enables TV services to be displayed on cellular phones but also broadband data transmission. The combination of cellular communication networks such as UMTS or GPRS with such a DVB-H broadcast network produces a hybrid network with enormous potential for mobile multimedia applications. Our prototype Night Scene Live, designed for young party-goers, demonstrates the potential and special features of the hybrid network and exemplifies the prospects for such applications. While DVB-H and the required standard for IP data broadcast (IP Datacast) are still under development, we developed a simulation environment and architecture using DVB-T as the broadcast network. With Night Scene Live, videos from parties can be broadcast to party-goers, attracting them to such events and keeping them informed about what’s going on where. In addition, a web portal provides further information about the events. Night Scene Live is a project of Niccimon, the Lower Saxony Competence Center for Information Systems for Mobile Usage, Germany.
Members:
Jean-Pierre Schober (jps@tzi.de) and Dr. Thorsten Hermes (hermes@tzi.de), University of Bremen, Center for Computing Technologies (TZI), Digital Media
Website:
Abstract:
One of the main challenges for an image retrieval system is to provide an efficient way of accessing stored images on both the syntactic and the semantic level. The image retrieval system PictureFinder, developed at the Center for Computing Technologies (TZI) of the University of Bremen, provides powerful tools for managing large amounts of images. A strong similarity search based on user-sketched or arbitrary images allows users to quickly find the desired images. OntoPic, the content-based image retrieval module, allows automatic annotation of images based on description logics. The domain-specific knowledge needed for this is stored in an ontology. Our demonstration gives a deep look into the techniques used by the PictureFinder system and the different steps needed to use OntoPic in a specific domain: ontology design, training, and the effects of these steps on the further use of the ontology in that domain.
MediaMill: Searching Multimedia Archives Based on Learned Semantics
Authors:
C.G.M. Snoek, D.C. Koelma, J. van Rest, N. Schipper, F.J. Seinstra, A. Thean, and M. Worring
Affiliation:
MediaMill, Kruislaan 403, 1098 SJ Amsterdam
Website:
Abstract:
Video is about to conquer the Internet. Real-time delivery of video content is technically possible to any desktop and mobile device, even over modest connections. The main problem hampering massive (re)use of video content today is the lack of effective content-based tools that provide semantic search on multimedia archives. In this demonstrator we show two systems that facilitate semantic search on a 200-hour news archive based on a lexicon of 32 learned concepts. One system exploits user interaction to retrieve segments of interest. The other facilitates personalized video search over the Internet based on user profiles. The demonstrator shows two effective usage scenarios for semantic search on multimedia archives.
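Search over a concept lexicon of this kind reduces to ranking segments by pre-computed concept scores; the sketch below illustrates the idea, with invented shot identifiers, concept names and scores standing in for the system's 32 learned concepts.

```python
# Hypothetical shots with per-concept detector scores, as a stand-in
# for the lexicon of learned concepts described above.
segments = [
    {"id": "shot_01", "scores": {"aircraft": 0.9, "people": 0.2}},
    {"id": "shot_02", "scores": {"aircraft": 0.1, "people": 0.8}},
    {"id": "shot_03", "scores": {"aircraft": 0.6, "people": 0.5}},
]

def search(concept, segments, top_k=2):
    """Rank segments by their (pre-computed) score for one concept
    and return the identifiers of the top-k results."""
    ranked = sorted(segments,
                    key=lambda s: s["scores"].get(concept, 0.0),
                    reverse=True)
    return [s["id"] for s in ranked[:top_k]]

print(search("aircraft", segments))  # ['shot_01', 'shot_03']
```

Because the concept scores are computed offline by learned detectors, a query at search time is only a ranking over stored numbers, which is what makes interactive search over hundreds of hours of video feasible.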
CATCH: Continuous Access to Cultural Heritage
Contributors and Affiliations:
Antal van den Bosch, Tilburg University, The Netherlands, Antal.vdnBosch@uvt.nl
Guus Schreiber, Free University Amsterdam, The Netherlands, schreiber@cs.vu.nl
Frank van Harmelen, Free University Amsterdam, The Netherlands, Frank.van.Harmelen@cs.vu.nl
Lambert Schomaker, University of Groningen, The Netherlands, schomaker@ai.rug.nl
Eric Postma, Universiteit Maastricht, The Netherlands, postma@cs.unimaas.nl
Jaap van den Herik, Universiteit Maastricht, The Netherlands, herik@cs.unimaas.nl
Paul de Bra, Eindhoven Technical University, The Netherlands, debra@win.tue.nl
Mettina Veenstra, Telematica Instituut, The Netherlands, Mettina.Veenstra@telin.nl
Abstract:
At the ICME Conference, CATCH will give three presentations, one within each of the three CATCH research themes:
IMIX – Interactive Multimodal Information EXtraction
Members and Affiliations:
Gosse Bouma, Rijksuniversiteit Groningen, The Netherlands
Walter Daelemans, Universiteit van Tilburg, The Netherlands
Emiel Krahmer, Universiteit van Tilburg, The Netherlands
Mariët Theune, Universiteit Twente, The Netherlands
Lou Boves, Radboud Universiteit Nijmegen, The Netherlands
Maarten de Rijke, Universiteit van Amsterdam, The Netherlands
Harry Bunt, Universiteit van Tilburg, The Netherlands
Rieks op den Akker, Universiteit Twente, The Netherlands
Website:
Abstract:
IMIX is an NWO (Netherlands Organisation for Scientific Research) research programme in the field of Dutch language and speech technology. IMIX combines research on Question Answering, Dialogue Management, and Multimodal Interaction. An important aim of IMIX is to investigate whether a multimodal interactive dialogue with a user leads to better answers in Question-Answering systems. The results of the research in IMIX will be integrated into a common demonstrator that implements a multimodal interactive Question-Answering system in a restricted, medical information domain. One disorder, RSI (Repetitive Strain Injury), was chosen as the test environment to investigate the contribution that multimodal input and output can make to the quality of the answers. The first version of the demonstrator integrates initial versions of most modules. With typed input, users can ask questions about all types of medical information. For questions related to RSI, users may use either text or speech input. The same holds for output: general medical information is returned in the form of written text, while information on RSI can be provided in a mix of text, speech, tables and pictures. Using the first version of the demonstrator, we will illustrate how interaction can improve the quality of the answers.