This paper contributes to the automatic indexing of concert video. In contrast to traditional methods, which rely primarily on audio information for summarization applications, we explore how a visual-only concept detection approach could be employed. We investigate how our recent method for news video indexing, which takes into account the role of content and style, generalizes to the concert domain. We analyze concert video on three levels of visual abstraction: content, style, and their fusion. Experiments with 12 concept detectors on 45 hours of visually challenging concert video show that the best approach, which is learned automatically, is concept-dependent. Moreover, these results suggest that the visual modality provides ample opportunity for more effective indexing and retrieval of concert video when used in addition to the auditory modality.
@InProceedings{SnoekICME2007a,
author = "Snoek, C. G. M. and Worring, M. and Smeulders, A. W. M. and Freiburg, B.",
title = "The Role of Visual Content and Style for Concert Video Indexing",
booktitle = "IEEE International Conference on Multimedia \& Expo",
pages = "252--255",
year = "2007",
url = "https://ivi.fnwi.uva.nl/isis/publications/2007/SnoekICME2007a",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2007/SnoekICME2007a/SnoekICME2007a.pdf",
has_image = 1
}