Transformation invariant SOM clustering in Document Image Analysis / S. Marinai; E. Marino; G. Soda. - Print. - (2007), pp. 185-190. (Paper presented at the 14th International Conference on Image Analysis and Processing, ICIAP 2007, held in Modena, September 10-14, 2007) [10.1109/ICIAP.2007.4362777].
Transformation invariant SOM clustering in Document Image Analysis
MARINAI, SIMONE; SODA, GIOVANNI
2007
Abstract
In this paper, we propose combining the self-organizing map (SOM) with the tangent distance for effective clustering in document image analysis. The proposed model (SOM_TD) is used for character and layout clustering, with applications to word retrieval and page classification. Using the tangent distance makes the SOM clustering more tolerant of small local transformations of the input patterns.
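
To illustrate the idea in the abstract, the following is a minimal sketch of a one-sided tangent distance and of how it could replace the Euclidean distance when searching for a SOM's best matching unit. It is not the authors' SOM_TD implementation. The function names, the use of 1-D patterns, the single shift tangent approximated by finite differences, and the choice to compute tangents at the prototype are all assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(tangents, eps=1e-12):
    """Gram-Schmidt orthonormalization of a list of tangent vectors."""
    basis = []
    for t in tangents:
        w = list(t)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip (near-)zero or linearly dependent tangents
            basis.append([wi / norm for wi in w])
    return basis


def shift_tangent(x):
    """Hypothetical helper: central-difference approximation of the
    derivative of a 1-D pattern with respect to a small shift."""
    n = len(x)
    return [(x[min(i + 1, n - 1)] - x[max(i - 1, 0)]) / 2.0 for i in range(n)]


def tangent_distance(x, y, tangents):
    """Squared one-sided tangent distance: the squared distance from y to
    the affine subspace through x spanned by the given tangent vectors."""
    r = [yi - xi for xi, yi in zip(x, y)]
    d2 = dot(r, r)
    for q in orthonormalize(tangents):
        d2 -= dot(q, r) ** 2  # remove the component explained by the tangent
    return max(d2, 0.0)  # guard against tiny negative rounding errors


def best_matching_unit(sample, prototypes):
    """Index of the prototype closest to the sample under the tangent
    distance (assumption: tangents are computed at each prototype)."""
    return min(
        range(len(prototypes)),
        key=lambda k: tangent_distance(
            prototypes[k], sample, [shift_tangent(prototypes[k])]
        ),
    )


# Usage: a bump and the same bump shifted by one position.
pattern = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 2, 1, 0, 0]
euclidean_sq = sum((a - b) ** 2 for a, b in zip(pattern, shifted))
tangent_sq = tangent_distance(pattern, shifted, [shift_tangent(pattern)])
winner = best_matching_unit(shifted, [pattern, [2, 1, 0, 0, 0, 0, 0, 0]])
```

In this sketch the tangent distance between the two bumps is smaller than their squared Euclidean distance, because part of the difference is explained by the shift direction. This is the kind of tolerance to small local transformations that the abstract describes.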