-
-
Cees G. M. Snoek, Koen E. A. van de Sande, Amirhossein Habibian, Svetlana
Kordumova, Zhenyang Li, Masoud Mazloom, Silvia-Laura Pintea, Ran Tao,
Dennis C. Koelma, and Arnold W. M. Smeulders.
The MediaMill TRECVID 2012 semantic video search engine,
November 2012.
[ bib |
www: ]
-
-
Cees G. M. Snoek, Koen E. A. van de Sande, Xirong Li, Masoud Mazloom, Yu-Gang
Jiang, Dennis C. Koelma, and Arnold W. M. Smeulders.
The MediaMill TRECVID 2011 semantic video search engine,
December 2011.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2011 video retrieval
experiments. The MediaMill team participated in two
tasks: semantic indexing and multimedia event detection.
The starting point for the MediaMill detection approach is
our top-performing bag-of-words system of TRECVID 2010,
which uses multiple color SIFT descriptors, sparse codebooks
with spatial pyramids, and kernel-based machine learning,
all supported by GPU-optimized algorithms, approximated
histogram intersection kernels, and multi-frame video processing.
This year our experiments focus on 1) the soft assignment
of descriptors with the use of difference coding,
2) the exploration of bag-of-words for event detection, and
3) the selection of informative concepts out of 1,346 concept
detectors as a representation for event detection. The
2011 edition of the TRECVID benchmark has again been
a fruitful participation for the MediaMill team, resulting in
the runner-up ranking for concept detection in the semantic
indexing task.
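The abstract above mentions approximated histogram intersection kernels as a key speed-up. As a rough illustration of the underlying kernel (this sketch computes the exact kernel, not the paper's approximation, and all names are illustrative):

```python
import numpy as np

def histogram_intersection_kernel(X, Y):
    """Exact histogram intersection kernel between two sets of histograms.

    X: (n, d) array, Y: (m, d) array; returns the (n, m) Gram matrix with
    K[i, j] = sum_k min(X[i, k], Y[j, k]).
    """
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

# Tiny example with two 3-bin histograms:
X = np.array([[0.2, 0.5, 0.3]])
Y = np.array([[0.4, 0.4, 0.2]])
print(histogram_intersection_kernel(X, Y))  # ≈ [[0.8]]
```

The approximation referenced in the abstract avoids materializing this Gram matrix at classification time; the exact form above is only meant to show what is being approximated.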
-
-
Koen E. A. van de Sande and Cees G. M. Snoek.
The University of Amsterdam's concept detection system at
ImageCLEF 2011, September 2011.
[ bib |
www: ]
-
-
Cees G. M. Snoek, Koen E. A. van de Sande, Ork de Rooij, Bouke Huurnink,
Efstratios Gavves, Daan Odijk, Maarten de Rijke, Theo Gevers, Marcel Worring,
Dennis C. Koelma, and Arnold W. M. Smeulders.
The MediaMill TRECVID 2010 semantic video search engine,
November 2010.
[ bib |
www: ]
-
-
Cees G. M. Snoek, Koen E. A. van de Sande, Ork de Rooij, Bouke Huurnink, Jasper
R. R. Uijlings, Michiel van Liempt, Miguel Bugalho, Isabel Trancoso, Fei Yan,
Muhammad A. Tahir, Krystian Mikolajczyk, Josef Kittler, Maarten de Rijke,
Jan-Mark Geusebroek, Theo Gevers, Marcel Worring, Dennis C. Koelma, and
Arnold W. M. Smeulders.
The MediaMill TRECVID 2009 semantic video search engine,
November 2009.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2009 video retrieval
experiments. The MediaMill team participated in three tasks:
concept detection, automatic search, and interactive search. The
starting point for the MediaMill concept detection approach is our
top-performing bag-of-words system of last year, which uses multiple
color descriptors, codebooks with soft-assignment, and kernel-based
supervised learning. We improve upon this baseline system by
exploring two novel research directions. Firstly, we study a
multi-modal extension by including 20 audio concepts and fusion
using two novel multi-kernel supervised learning methods. Secondly,
with the help of recently proposed algorithmic refinements of
bag-of-word representations, a GPU implementation, and compute
clusters, we scale up the amount of visual information analyzed by
an order of magnitude, to a total of 1,000,000 i-frames. Our
experiments evaluate the merit of these new components, ultimately
leading to 64 robust concept detectors for video retrieval. For
retrieval, a robust but limited set of concept detectors justifies
relying on as many auxiliary information channels as
possible. For automatic search we therefore explore how we can
learn to rank various information channels simultaneously to
maximize video search results for a given topic. To further improve
the video retrieval results, our interactive search experiments
investigate the roles of visualizing preview results for a certain
browse-dimension and relevance feedback mechanisms that learn to
solve complex search topics by analyzing user browsing behavior.
The 2009 edition of the TRECVID benchmark has again been a fruitful
participation for the MediaMill team, resulting in the top ranking
for both concept detection and interactive search. Again a lot has
been learned during this year's TRECVID campaign; we highlight the
most important lessons at the end of this paper.
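The abstract above builds on codebooks with soft-assignment, where each local descriptor votes for several codewords rather than only its nearest one. A minimal sketch of this idea (the function name, Gaussian weighting, and `sigma` bandwidth are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def soft_assign_bow(descriptors, codebook, sigma=1.0):
    """Soft-assignment bag-of-words histogram.

    Each descriptor votes for every codeword with a Gaussian weight on
    its distance, instead of a single hard nearest-neighbor assignment.
    descriptors: (n, d), codebook: (k, d); returns a length-k histogram.
    """
    # Squared Euclidean distances between descriptors and codewords.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma**2))
    w /= w.sum(axis=1, keepdims=True)   # normalize each descriptor's votes
    h = w.sum(axis=0)
    return h / h.sum()                  # L1-normalized histogram

desc = np.random.RandomState(0).rand(5, 2)   # 5 toy 2-D descriptors
cb = np.array([[0.0, 0.0], [1.0, 1.0]])      # 2 toy codewords
print(soft_assign_bow(desc, cb))
```

The resulting histogram feeds the kernel-based supervised learners mentioned in the abstract.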
-
-
Cees G. M. Snoek, Koen E. A. van de Sande, Ork de Rooij, Bouke Huurnink, Jan C.
van Gemert, Jasper R. R. Uijlings, J. He, Xirong Li, Ivo Everts, Vladimir
Nedović, Michiel van Liempt, Richard van Balen, Fei Yan, Muhammad A. Tahir,
Krystian Mikolajczyk, Josef Kittler, Maarten de Rijke, Jan-Mark Geusebroek,
Theo Gevers, Marcel Worring, Arnold W. M. Smeulders, and Dennis C. Koelma.
The MediaMill TRECVID 2008 semantic video search engine,
November 2008.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2008 video retrieval
experiments. The MediaMill team participated in three tasks: concept
detection, automatic search, and interactive search. Rather than
continuing to increase the number of concept detectors available for
retrieval, our TRECVID 2008 experiments focus on increasing the
robustness of a small set of detectors using a bag-of-words
approach. To that end, our concept detection experiments emphasize
in particular the role of visual sampling, the value of color
invariant features, the influence of codebook construction, and the
effectiveness of kernel-based learning parameters. For retrieval, a
robust but limited set of concept detectors necessitates relying
on as many auxiliary information channels as possible.
Therefore, our automatic search experiments focus on predicting
which information channel to trust given a certain topic, leading to
a novel framework for predictive video retrieval. To improve the
video retrieval results further, our interactive search experiments
investigate the roles of visualizing preview results for a certain
browse-dimension and active learning mechanisms that learn to solve
complex search topics by analyzing user browsing behavior. The
2008 edition of the TRECVID benchmark has been the most successful
MediaMill participation to date, resulting in the top ranking for
both concept detection and interactive search, and a runner-up
ranking for automatic retrieval. Again a lot has been learned during
this year's TRECVID campaign; we highlight the most important
lessons at the end of this paper.
-
-
Cees G. M. Snoek, I. Everts, Jan C. van Gemert, Jan-Mark Geusebroek, Bouke
Huurnink, Dennis C. Koelma, Michiel van Liempt, Ork de Rooij, Koen E. A.
van de Sande, Arnold W. M. Smeulders, Jasper R. R. Uijlings, and Marcel
Worring.
The MediaMill TRECVID 2007 semantic video search engine,
November 2007.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2007 experiments. The
MediaMill team participated in two tasks: concept detection and
search. For concept detection we extract region-based image
features, on grid, keypoint, and segmentation level, which we
combine with various supervised learners. In addition, we explore
the utility of temporal image features. A late fusion approach of
all region-based analysis methods using geometric mean was our most
successful run. What is more, using MediaMill Challenge and LSCOM
annotations, our visual-only approach generalizes to a set of 572
concept detectors. To handle such a large thesaurus in retrieval, an
engine is developed which automatically selects a set of relevant
concept detectors based on text matching, ontology querying, and
visual concept likelihood. The suggestion engine is evaluated as
part of the automatic search task and forms the entry point for our
interactive search experiments. For this task we experiment with two
browsers for interactive exploration: the well-known CrossBrowser
and the novel ForkBrowser. It was found that, while retrieval
performance varies substantially per topic, the ForkBrowser is able
to produce the same overall results as the CrossBrowser. However,
the ForkBrowser obtains top performance for most topics with less
user interaction, indicating the potential of this browser for
interactive search. Similar to previous years our best interactive
search runs yield high overall performance, ranking 3rd and 4th.
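The abstract above reports that late fusion of region-based analysis methods using the geometric mean was the most successful run. A minimal sketch of geometric-mean late fusion over detector scores (the function name, score clipping, and the toy scores are assumptions for illustration):

```python
import numpy as np

def geometric_mean_fusion(score_lists, eps=1e-10):
    """Late fusion of per-method detector scores by geometric mean.

    score_lists: (m, n) array of scores in [0, 1] from m analysis
    methods over n shots; returns n fused scores. Scores are clipped
    away from zero so a single zero does not annihilate a shot.
    """
    s = np.clip(np.asarray(score_lists, dtype=float), eps, None)
    return np.exp(np.log(s).mean(axis=0))

# Two methods scoring two shots:
scores = np.array([[0.9, 0.1],
                   [0.4, 0.2]])
print(geometric_mean_fusion(scores))  # ≈ [0.6, 0.1414]
```

Compared with the arithmetic mean, the geometric mean rewards shots on which all methods agree and penalizes shots that only one method scores highly.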
-
-
Cees G. M. Snoek, Jan C. van Gemert, Theo Gevers, Bouke Huurnink, Dennis C.
Koelma, Michiel van Liempt, Ork de Rooij, Koen E. A. van de Sande, Frank J.
Seinstra, Arnold W. M. Smeulders, Andrew H. C. Thean, Cor J. Veenman, and
Marcel Worring.
The MediaMill TRECVID 2006 semantic video search engine,
November 2006.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2006 experiments. The MediaMill team
participated in two tasks: concept detection and search. For concept detection
we use the MediaMill Challenge as experimental platform. The MediaMill Challenge
divides the generic video indexing problem into a visual-only, textual-only,
early fusion, late fusion, and combined analysis experiment. We provide a baseline
implementation for each experiment together with baseline results, which we made
available for the TRECVID community. The Challenge package was downloaded more
than 80 times and we anticipate that it has been used by several teams for their
2006 submission. Our Challenge experiments focus specifically on visual-only
analysis of video (run id: B_MM). We extract image features, on global, regional,
and keypoint level, which we combine with various supervised learners. A late
fusion approach of visual-only analysis methods using geometric mean was our most
successful run. With this run we surpass the Challenge baseline by more than 50%.
Our concept detection experiments have resulted in the best score for three
concepts: desert, US flag, and charts. What is more,
using LSCOM annotations, our visual-only approach generalizes well to a set of
491 concept detectors. To handle such a large thesaurus in retrieval, an engine
is developed which automatically selects a set of relevant concept detectors based
on text matching and ontology querying. The suggestion engine is evaluated as part
of the automatic search task (run id: A-MM) and forms the entry point for our
interactive search experiments (run id: A-MM). Here we experiment with query by
object matching and two browsers for interactive exploration: the CrossBrowser and
the novel NovaBrowser. It was found that the NovaBrowser is able to produce the
same results as the CrossBrowser, but with less user interaction. Similar to
previous years our best interactive search runs yield top performance, ranking
2nd and 6th overall. Again a lot has been learned during this year's TRECVID
campaign; we highlight the most important lessons at the end of this paper.
-
-
Cees G. M. Snoek, Jan C. van Gemert, Jan-Mark Geusebroek, Bouke Huurnink,
Dennis C. Koelma, Giang P. Nguyen, Ork de Rooij, Frank J. Seinstra, Arnold
W. M. Smeulders, Cor J. Veenman, and Marcel Worring.
The MediaMill TRECVID 2005 semantic video search engine,
November 2005.
[ bib |
.pdf ]
In this paper we describe our TRECVID 2005 experiments. The UvA-MediaMill team
participated in four tasks. For the detection of camera work (runid: A_CAM) we
investigate the benefit of using a tessellation of detectors in combination with
supervised learning over a standard approach using global image information.
Experiments indicate that average precision results increase drastically,
especially for pan (+51%) and tilt (+28%). For concept detection we propose a
generic approach using our semantic pathfinder. The most important novelty
compared to last year's system is the improved visual analysis using proto-concepts based
on Wiccest features. In addition, the path selection mechanism was extended.
Based on the semantic pathfinder architecture we are currently able to detect an
unprecedented lexicon of 101 semantic concepts in a generic fashion. We performed
a large set of experiments (runid: B_vA). The results show that an optimal
strategy for generic multimedia analysis is one that learns from the training
set on a per-concept basis which tactic to follow. Experiments also indicate
that our visual analysis approach is highly promising. The lexicon of 101
semantic concepts forms the basis for our search experiments (runid: B_2_A-MM).
We participated in automatic, manual (using only visual information), and
interactive search. The lexicon-driven retrieval paradigm aids substantially in
all search tasks. When coupled with interaction, exploiting several novel browsing
schemes of our semantic video search engine, results are excellent. We obtain a
top-3 result for 19 out of 24 search topics. In addition, we obtain the highest
mean average precision of all search participants. We exploited the technology
developed for the above tasks to explore the BBC rushes. The most
intriguing result is that 25 of the 101 visual-only models trained
on news data also perform reasonably well on BBC data.
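Several abstracts above report results in (mean) average precision, the standard TRECVID ranking metric. A minimal sketch of non-interpolated average precision over a ranked shot list (note: official TRECVID scoring divides by the total number of relevant shots in the collection; this sketch divides by the relevant shots found in the list, assuming all of them are retrieved):

```python
def average_precision(ranked_relevance):
    """Non-interpolated average precision over a ranked result list.

    ranked_relevance: 0/1 relevance judgments in rank order.
    Averages precision at each rank where a relevant shot appears.
    """
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

# Relevant shots at ranks 1, 3, and 4:
print(average_precision([1, 0, 1, 1]))  # (1/1 + 2/3 + 3/4) / 3 ≈ 0.806
```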
-
-
Cees G. M. Snoek, Marcel Worring, Jan-Mark Geusebroek, Dennis C. Koelma, and
Frank J. Seinstra.
The MediaMill TRECVID 2004 semantic video search engine,
November 2004.
[ bib |
.pdf ]
This year the UvA-MediaMill team participated in the Feature Extraction and Search
Task. We developed a generic approach for semantic concept classification using
the semantic value chain. The semantic value chain extracts concepts from video
documents based on three consecutive analysis links, named the content link, the
style link, and the context link. Various experiments within the analysis links
were performed, showing amongst others the merit of processing beyond key frames,
the value of style elements, and the importance of learning semantic context. For
all experiments a lexicon of 32 concepts was exploited, 10 of which are part of
the Feature Extraction Task. Top three system-based ranking in 8 out of the 10
benchmark concepts indicates that our approach is very promising. Apart from this,
the lexicon of 32 concepts proved very useful in an interactive search scenario
with our semantic video search engine, where we obtained the highest mean average
precision of all participants.
-
-
Alexander Hauptmann, Robert V. Baron, Ming-Yu Chen, Michael Christel, Pinar
Duygulu, Chang Huang, Rong Jin, Wei-Hao Lin, Dorbin Ng, Neema Moraveji,
Norman Papernick, Cees G. M. Snoek, George Tzanetakis, Jun Yang, Rong Yan,
and Howard D. Wactlar.
Informedia at TRECVID 2003: Analyzing and searching broadcast
news video, November 2003.
[ bib |
.pdf ]
-
-
Marcel Worring, Giang P. Nguyen, Laura Hollink, Jan van Gemert, and Dennis C.
Koelma.
Interactive search using indexing, filtering, browsing, and
ranking, November 2003.
[ bib |
.pdf ]
-
-
Jeroen Vendrig, Jurgen den Hartog, David van Leeuwen, Ioannis Patras, Stephan
Raaijmakers, Cees Snoek, Jeroen van Rest, and Marcel Worring.
TREC feature extraction by active learning, November 2002.
[ bib |
.pdf ]
-
-
Jan Baan, Alex van Ballegooij, Jan-Mark Geusebroek, Djoerd Hiemstra, Jurgen den
Hartog, Johan List, Cees Snoek, Ioannis Patras, Stephan Raaijmakers, Leon
Todoran, Jeroen Vendrig, Arjen de Vries, Thijs Westerveld, and Marcel
Worring.
Lazy users and automatic video retrieval tools in (the)
lowlands, November 2001.
[ bib |
.pdf ]