Using Earth Mover's Distance in the Bag-of-Visual-Words Model for Mathematical Symbol Retrieval / Simone Marinai; Beatrice Miotti; Giovanni Soda. - Print. - (2011), pp. 1309-1313. (Paper presented at the International Conference on Document Analysis and Recognition, held in Beijing (China) in Sept. 2011) [10.1109/ICDAR.2011.263].
Using Earth Mover's Distance in the Bag-of-Visual-Words Model for Mathematical Symbol Retrieval
MARINAI, SIMONE;MIOTTI, BEATRICE;SODA, GIOVANNI
2011
Abstract
In this paper, the Earth Mover’s Distance (EMD) is used as a similarity measure in the mathematical symbol retrieval task. The approach is based on the Bag-of-Visual-Words model. In our case, the features extracted from each symbol are clustered by means of Self-Organizing Maps (SOM), and the occurrences of features in the clusters are then accumulated in a vector of visual words. These vectors are compared with the EMD, which naturally allows the topological organization of the SOM clusters to be incorporated into the distance computation. The proposed approach is experimentally tested in a mathematical symbol retrieval task and compared with the cosine similarity and with some recently proposed variants.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
icdar11_miotti-CR.pdf | Open access | Final refereed version (Postprint, Accepted manuscript) | All rights reserved | 231.2 kB | Adobe PDF
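
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`som_grid_distance`, `visual_word_histogram`, `emd`) are hypothetical, the ground distance is assumed to be the Euclidean distance between unit positions on a rectangular SOM grid (the abstract does not say which ground distance is used), and the EMD is computed with a generic successive-shortest-path min-cost flow over integer counts, normalized by the total flow as in the usual EMD definition.

```python
import math
from collections import Counter


def som_grid_distance(a, b, cols):
    """Euclidean distance between SOM units a and b on a grid with `cols` columns
    (assumed ground distance; units are numbered row by row)."""
    (ra, ca), (rb, cb) = divmod(a, cols), divmod(b, cols)
    return math.hypot(ra - rb, ca - cb)


def visual_word_histogram(unit_indices, n_units):
    """Count how many features of one symbol were assigned to each SOM unit."""
    counts = Counter(unit_indices)
    return [counts.get(u, 0) for u in range(n_units)]


def emd(p, q, ground):
    """EMD between two integer histograms p and q of equal length.
    ground(i, j) is the cost of moving one unit of mass from bin i to bin j."""
    n = len(p)
    src, snk = 2 * n, 2 * n + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(2 * n + 2)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i in range(n):
        if p[i]:
            add_edge(src, i, p[i], 0.0)
        if q[i]:
            add_edge(n + i, snk, q[i], 0.0)
    for i in range(n):
        for j in range(n):
            if p[i] and q[j]:
                add_edge(i, n + j, min(p[i], q[j]), ground(i, j))

    total_flow, total_cost = 0, 0.0
    while True:
        # Bellman-Ford shortest path from src in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if dist[snk] == math.inf:
            break  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = math.inf, snk
        while v != src:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = snk
        while v != src:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_flow += push
        total_cost += push * dist[snk]

    return total_cost / total_flow if total_flow else 0.0


# Toy 3x3 SOM: three symbols whose features all fall into a single unit.
ground = lambda i, j: som_grid_distance(i, j, cols=3)
p = visual_word_histogram([0, 0, 0], 9)  # top-left unit
q = visual_word_histogram([1, 1, 1], 9)  # unit adjacent to p's
r = visual_word_histogram([8, 8, 8], 9)  # opposite corner
print(emd(p, q, ground), emd(p, r, ground))
```

In this toy case the cosine similarity of `p` with both `q` and `r` is zero, because the histograms share no non-empty bin, whereas the EMD ranks `q` as closer to `p` than `r` because its features fall into a neighbouring SOM unit. For realistic codebook sizes an optimized transportation solver would be preferable to this sketch.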
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.