The Microsoft SenseCam is a small, lightweight wearable camera used to passively capture photos and other sensor readings from a user’s day-to-day activities. It captures on average 3,000 images in a typical day, equating to almost 1 million images per year. It can be used to aid memory by creating a personal multimedia lifelog, or visual recording of the wearer’s life. However, the sheer volume of image data captured within a visual lifelog creates a number of challenges, particularly for locating relevant content. Within this work, we explore the applicability of semantic concept detection, a method often used within video retrieval, to the domain of visual lifelogs. Our concept detector models the correspondence between low-level visual features and high-level semantic concepts (such as indoors, outdoors, people, buildings, etc.) using supervised machine learning. By doing so, it determines the probability of a concept’s presence. We apply detection of 27 everyday semantic concepts on a lifelog collection composed of 257,518 SenseCam images from 5 users. The results were evaluated on a subset of 95,907 images to determine the detection accuracy for each semantic concept. We conducted further analysis on the temporal consistency, co-occurrence and relationships within the detected concepts to more extensively investigate the robustness of the detectors within this domain.
@Article{ByrneMTA2010,
author = "Byrne, D. and Doherty, A. R. and Snoek, C. G. M. and Jones, G. J. F.
and Smeaton, A. F.",
title = "Everyday Concept Detection in Visual Lifelogs: Validation, Relationships and Trends",
journal = "Multimedia Tools and Applications",
number = "1",
volume = "49",
pages = "119--144",
year = "2010",
url = "https://ivi.fnwi.uva.nl/isis/publications/2010/ByrneMTA2010",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2010/ByrneMTA2010/ByrneMTA2010.pdf",
has_image = 1
}