The MediaMill TRECVID 2006 Semantic Video Search Engine

C. G. M. Snoek, J. C. van Gemert, T. Gevers, B. Huurnink, D. C. Koelma, M. van Liempt, O. de Rooij, K. E. A. van de Sande, F. J. Seinstra, A. W. M. Smeulders, A. H. C. Thean, C. J. Veenman, M. Worring
In TRECVID Workshop 2006.
Abstract
In this paper we describe our TRECVID 2006 experiments. The MediaMill team participated in two tasks: concept detection and search. For concept detection we use the MediaMill Challenge as experimental platform. The MediaMill Challenge divides the generic video indexing problem into a visual-only, textual-only, early fusion, late fusion, and combined analysis experiment. We provide a baseline implementation for each experiment together with baseline results, which we made available to the TRECVID community. The Challenge package was downloaded more than 80 times and we anticipate that it has been used by several teams for their 2006 submissions. Our Challenge experiments focus specifically on visual-only analysis of video (run id: B_MM). We extract image features at the global, regional, and keypoint level, which we combine with various supervised learners. A late fusion approach of visual-only analysis methods using the geometric mean was our most successful run. With this run we surpass the Challenge baseline by more than 50%. Our concept detection experiments have resulted in the best score for three concepts: desert, flag us, and charts. Moreover, using LSCOM annotations, our visual-only approach generalizes well to a set of 491 concept detectors. To handle such a large thesaurus in retrieval, an engine is developed which automatically selects a set of relevant concept detectors based on text matching and ontology querying. The suggestion engine is evaluated as part of the automatic search task (run id: A-MM) and forms the entry point for our interactive search experiments (run id: A-MM). Here we experiment with query by object matching and two browsers for interactive exploration: the CrossBrowser and the novel NovaBrowser. It was found that the NovaBrowser is able to produce the same results as the CrossBrowser, but with less user interaction. Similar to previous years, our best interactive search runs yield top performance, ranking 2nd and 6th overall. Again a lot has been learned during this year's TRECVID campaign; we highlight the most important lessons at the end of this paper.
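
The most successful run described above fuses the outputs of several visual-only detectors by taking the geometric mean of their per-shot scores. The paper itself contains no code; the sketch below is only a minimal illustration of that late-fusion idea, assuming each detector produces a probability per shot. The function name, detector descriptions, and example data are hypothetical.

```python
import numpy as np

def late_fusion_geometric_mean(scores, eps=1e-12):
    """Fuse per-shot concept scores from several detectors via geometric mean.

    scores : array-like of shape (n_detectors, n_shots), values in [0, 1]
    eps    : small constant to avoid log(0) for zero scores
    Returns an array of shape (n_shots,) with the fused score per shot.
    """
    scores = np.asarray(scores, dtype=float)
    # Compute the geometric mean in log space for numerical stability.
    return np.exp(np.mean(np.log(scores + eps), axis=0))

# Hypothetical example: three visual-only detectors scoring four shots.
detector_scores = [
    [0.90, 0.10, 0.40, 0.70],  # e.g. detector trained on global features
    [0.80, 0.20, 0.50, 0.60],  # e.g. detector trained on regional features
    [0.95, 0.05, 0.30, 0.80],  # e.g. detector trained on keypoint features
]
fused = late_fusion_geometric_mean(detector_scores)
ranking = np.argsort(-fused)  # shots ranked by fused relevance score
print(fused, ranking)
```

Because the geometric mean is dominated by low scores, a shot must be scored highly by all detectors to rank near the top, which is one common motivation for choosing it over an arithmetic mean in late fusion.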



Bibtex Entry
@InProceedings{SnoekPTRECVID2006,
  author       = "Snoek, C. G. M. and van Gemert, J. C. and Gevers, T. and Huurnink, B.
                  and Koelma, D. C. and van Liempt, M. and de Rooij, O. and van de Sande, K. E. A.
                  and Seinstra, F. J. and Smeulders, A. W. M. and Thean, A. H. C. and Veenman, C. J.
                  and Worring, M.",
  title        = "The MediaMill TRECVID 2006 Semantic Video Search Engine",
  booktitle    = "TRECVID Workshop",
  year         = "2006",
  url          = "https://ivi.fnwi.uva.nl/isis/publications/2006/SnoekPTRECVID2006",
  pdf          = "https://ivi.fnwi.uva.nl/isis/publications/2006/SnoekPTRECVID2006/SnoekPTRECVID2006.pdf",
  has_image    = 1
}