We start from the state-of-the-art Bag of Words pipeline
that in the 2008 benchmarks of TRECvid and PASCAL
yielded the best performance scores. We have contributed
to that pipeline, which now forms the basis to compare
various fast alternatives for all of its components: (i) For
descriptor extraction we propose a fast algorithm to densely
sample SIFT and SURF, and we compare several variants
of these descriptors. (ii) For descriptor projection we
compare a k-means visual vocabulary with a Random Forest.
As a preprojection step we experiment with PCA on the
descriptors to decrease projection time. (iii) For classification
we use Support Vector Machines and compare the χ² kernel
with the RBF kernel. Our results lead to a 10-fold speed
increase without any loss of accuracy and to a 30-fold speed
increase with 17% loss of accuracy, where the latter system
does real-time classification at 26 images per second.
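
As a rough illustration of the kernel comparison in (iii), the following Python sketch computes both kernels on toy visual-word histograms. It uses one common exponential form of the chi-squared kernel; the function names, the gamma value, the eps guard, and the histogram values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def chi2_kernel(X, Y, gamma=1.0, eps=1e-10):
    """Exponential chi-squared kernel between the rows of X and the rows of Y.

    Assumed form: K(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)).
    """
    # Pairwise differences and sums, each of shape (n_x, n_y, n_bins).
    diff = X[:, None, :] - Y[None, :, :]
    total = X[:, None, :] + Y[None, :, :]
    # eps avoids division by zero for bins that are empty in both histograms.
    dist = np.sum(diff ** 2 / (total + eps), axis=2)
    return np.exp(-gamma * dist)


def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=2))


# Toy L1-normalised bag-of-words histograms over 3 visual words (made-up values).
H = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])

K_chi2 = chi2_kernel(H, H)
K_rbf = rbf_kernel(H, H)
```

Either matrix could be passed to an SVM implementation that accepts precomputed kernels. The chi-squared form divides each squared bin difference by the bin mass, so differences in sparsely populated bins weigh more than under the RBF kernel.
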
@InProceedings{UijlingsICIVR2009,
author = "Uijlings, J. R. R. and Smeulders, A. W. M. and Scha, R. J. H.",
title = "Real-Time Bag of Words, Approximately",
booktitle = "ACM International Conference on Image and Video Retrieval",
year = "2009",
url = "https://ivi.fnwi.uva.nl/isis/publications/2009/UijlingsICIVR2009",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2009/UijlingsICIVR2009/UijlingsICIVR2009.pdf",
has_image = 1
}