Social image retrieval is important for exploiting the increasing amounts of amateur-tagged multimedia such as Flickr images. Since amateur tagging is known to be uncontrolled, ambiguous, and personalized, a fundamental problem is how to reliably interpret the relevance of a tag with respect to the visual content it is describing. Intuitively, if different persons label similar images
using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose a novel algorithm that scalably and reliably learns tag relevance by accumulating votes from visually similar neighbors. Further, by treating learned tag relevance as tag frequency, we seamlessly embed it into current tag-based social image retrieval paradigms.
Preliminary experiments on one million Flickr images demonstrate the potential of the proposed algorithm. Overall comparisons for both single-word queries and multiple-word queries show substantial improvement over the baseline by learning and using tag relevance. Specifically, compared with the baseline using the original tags, on average, retrieval using improved tags increases mean average
precision by 24%, from 0.54 to 0.67. Moreover, simulated
experiments indicate that performance can be improved further by scaling up the number of images used in the proposed neighbor voting algorithm.
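The neighbor voting idea described above can be sketched in a few lines: count how many of an image's k visually nearest neighbors carry the tag, then subtract the number of votes the tag would receive from random neighbors (its prior frequency in the collection). This is a minimal illustrative sketch, not the authors' implementation; the feature representation, distance metric, and the value of k are assumptions.

```python
import numpy as np

def tag_relevance(query_feature, query_tag, features, tags, k=100):
    """Estimate the relevance of `query_tag` for an image by neighbor voting.

    A sketch under assumptions: `features` is an (n, d) array of visual
    features, `tags` is a list of n tag sets, and Euclidean distance
    stands in for whatever visual similarity the real system uses.
    """
    # Distances from the query image to every image in the collection.
    dists = np.linalg.norm(features - query_feature, axis=1)
    # Indices of the k visually most similar neighbors.
    neighbor_idx = np.argsort(dists)[:k]
    # Votes: neighbors (ideally labeled by different users) carrying the tag.
    votes = sum(1 for i in neighbor_idx if query_tag in tags[i])
    # Expected votes from k randomly drawn images, i.e., the tag's prior;
    # subtracting it keeps frequent tags from dominating.
    prior = k * sum(1 for t in tags if query_tag in t) / len(tags)
    return votes - prior
```

A positive score means the tag occurs among visual neighbors more often than chance, which is exactly the signal the retrieval paradigm can then use in place of raw tag frequency.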
@InProceedings{LiICMIR2008,
author = "Li, X. and Snoek, C. G. M. and Worring, M.",
title = "Learning Tag Relevance by Neighbor Voting for Social Image Retrieval",
booktitle = "ACM International Conference on Multimedia Information Retrieval",
pages = "180--187",
year = "2008",
url = "https://ivi.fnwi.uva.nl/isis/publications/2008/LiICMIR2008",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2008/LiICMIR2008/LiICMIR2008.pdf",
has_image = 1
}