Effective video retrieval is the result of an interplay between interactive query selection, advanced visualization of results, and a goal-oriented human user. Traditional interactive video retrieval approaches emphasize paradigms such as query-by-keyword and query-by-example to aid the user in the search for relevant footage. However, recent results in automatic indexing indicate that query-by-concept is also becoming a viable resource for interactive retrieval. In this paper, we propose a new video retrieval paradigm. The core of the paradigm is formed by first detecting a large lexicon of semantic concepts. From there, we combine query-by-concept, query-by-example, query-by-keyword, and user interaction into the \emph{MediaMill} semantic video search engine. To measure the impact of increasing lexicon size on interactive video retrieval performance, we performed two experiments against the 2004 and 2005 NIST TRECVID benchmarks, using lexicons containing 32 and 101 concepts, respectively. The results suggest that, of all the factors that play a role in interactive retrieval, a large lexicon of semantic concepts matters most. Indeed, by exploiting large lexicons, many video search questions are solvable without using query-by-keyword and query-by-example. What is more, we show that the lexicon-driven search engine outperforms all state-of-the-art video retrieval systems in both TRECVID 2004 and 2005.
@Article{SnoekITM2007,
author = "Snoek, C. G. M. and Worring, M. and Koelma, D. C. and Smeulders, A. W. M.",
title = "A Learned Lexicon-Driven Paradigm for Interactive Video Retrieval",
journal = "IEEE Transactions on Multimedia",
number = "2",
volume = "9",
pages = "280--292",
year = "2007",
url = "https://ivi.fnwi.uva.nl/isis/publications/2007/SnoekITM2007",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2007/SnoekITM2007/SnoekITM2007.pdf",
has_image = 1
}