Multiscale document description using rectangular granulometries / Bagdanov, Andrew D.*; Worring, Marcel. - In: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION. - ISSN 1433-2833. - PRINT. - 6:(2003), pp. 181-191. [10.1007/s10032-003-0112-1]
Multiscale document description using rectangular granulometries
Bagdanov, Andrew D.; Worring, Marcel
2003
Abstract
When comparing document images based on visual similarity, it is difficult to determine the correct scale and features for document representation. We report on a new form of multivariate granulometries based on rectangles of varying size and aspect ratio. These rectangular granulometries are used to probe the layout structure of document images, and the rectangular size distributions derived from them are used as descriptors for document images. Feature selection is used to reduce the dimensionality and redundancy of the size distributions while preserving the essence of the visual appearance of a document. Experimental results indicate that rectangular size distributions are an effective way to characterize the visual similarity of document images, and that they allow classification and retrieval results to be interpreted in the original image space rather than in an abstract feature space.
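The notion of a rectangular size distribution can be illustrated with a minimal sketch. It assumes a binary page image stored as nested lists (1 = ink, 0 = background), takes the opening by a w-by-h rectangle to be an erosion followed by a dilation, and records the fraction of foreground area that each opening removes. The function names, the normalization, and the choice to open the foreground rather than the background are illustrative assumptions, not the definitions used in the paper.

```python
def erode(img, w, h):
    """Mark every position where a w-by-h rectangle (anchored at its
    top-left corner) fits entirely inside the foreground."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            if all(img[r + dr][c + dc] for dr in range(h) for dc in range(w)):
                out[r][c] = 1
    return out


def dilate(img, w, h):
    """Paint a w-by-h rectangle from every marked top-left corner."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if img[r][c]:
                for dr in range(h):
                    for dc in range(w):
                        if r + dr < rows and c + dc < cols:
                            out[r + dr][c + dc] = 1
    return out


def opening(img, w, h):
    """Union of all w-by-h rectangles that fit inside the foreground."""
    return dilate(erode(img, w, h), w, h)


def area(img):
    """Number of foreground pixels."""
    return sum(sum(row) for row in img)


def size_distribution(img, widths, heights):
    """Fraction of foreground removed by each rectangular opening,
    keyed by (width, height). The normalization is an assumption."""
    total = area(img)
    if total == 0:
        raise ValueError("image has no foreground pixels")
    return {
        (w, h): 1.0 - area(opening(img, w, h)) / total
        for w in widths
        for h in heights
    }


# Hypothetical toy page: a 4-wide, 2-high block plus one isolated pixel.
page = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
dist = size_distribution(page, widths=[1, 2, 4, 5], heights=[1, 2, 3])
```

Each entry of `dist` is one coordinate of a descriptor indexed by rectangle width and height. On a real page, wide and flat rectangles would respond to text lines while tall ones would respond to columns, which is the sense in which such a descriptor probes layout at several scales and aspect ratios.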