When comparing document images based on
visual similarity, it is difficult to determine the correct
scale and features for document representation. We report
on a new form of multivariate granulometries based
on rectangles of varying size and aspect ratio. These
rectangular granulometries are used to probe the layout
structure of document images, and the rectangular size
distributions derived from them are used as descriptors
for document images. Feature selection is used to reduce
the dimensionality and redundancy of the size distributions
while preserving the essence of the visual appearance
of a document. Experimental results indicate that
rectangular size distributions are an effective way to characterize
the visual similarity of document images and provide
insightful interpretation of classification and retrieval results
in the original image space rather than the abstract
feature space.
@Article{BagdanovIJDAR2004,
author = "Bagdanov, A. and Worring, M.",
title = "Multiscale Document Description Using Rectangular Granulometries",
journal = "International Journal on Document Analysis and Recognition",
number = "3",
volume = "6",
pages = "181--191",
year = "2004",
url = "https://ivi.fnwi.uva.nl/isis/publications/2004/BagdanovIJDAR2004",
pdf = "https://ivi.fnwi.uva.nl/isis/publications/2004/BagdanovIJDAR2004/BagdanovIJDAR2004.pdf"
}