To resolve the mismatch between user needs and current image retrieval techniques,
we conducted a study to learn more about what users look for in images. First, we
developed a framework for the classification of image descriptions by users, based on various
classification methods from the literature. The classification framework distinguishes three
related viewpoints on images, namely nonvisual metadata, perceptual descriptions and
conceptual descriptions. For every viewpoint, a set of descriptive classes and relations is
specified. We used the framework in an empirical study, in which image descriptions were
formulated by 30 participants. The resulting descriptions were split into fragments and
categorized in the framework. The results suggest that users prefer general descriptions as
opposed to specific or abstract descriptions. Frequently used categories were objects, events
and relations between objects in the image.
@Article{HollinkIJHCS2004,
author = "Hollink, L. and Schreiber, A. T. and Wielinga, B. and Worring, M.",
title = "Classification of User Image Descriptions",
journal = "International Journal of Human-Computer Studies",
number = "5",
volume = "61",
pages = "601--626",
year = "2004",
url = "https://ivi.fnwi.uva.nl/isis/publications/2004/HollinkIJHCS2004",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2004/HollinkIJHCS2004/HollinkIJHCS2004.pdf"
}