Historical handwritten document segmentation by using a weighted loss / Capobianco, Samuele*; Scommegna, Leonardo; Marinai, Simone. - PRINT. - 11081:(2018), pp. 395-406. (Paper presented at the 8th IAPR TC3 Workshop on Artificial Neural Networks for Pattern Recognition, ANNPR 2018, held in Italy in 2018) [10.1007/978-3-319-99978-4_31].
Historical handwritten document segmentation by using a weighted loss
Capobianco, Samuele; Scommegna, Leonardo; Marinai, Simone
2018
Abstract
In this work we propose a deep architecture to identify text and non-text regions in historical handwritten documents. In particular, we adopt the U-net architecture in combination with a suitable weighted loss function in order to put more emphasis on the most critical areas. We define a weight map to balance the pixel frequency among classes and to guide the training with local prior rules. In the experiments we evaluate the performance of the U-net architecture and of the weighted training on one benchmark dataset. We obtain good results on global metrics, improving both global and local classification scores.

File | Size | Format |
---|---|---|
capobiancoscommegna.pdf (open access). Description: camera-ready version produced by the authors. Type: final refereed version (postprint, accepted manuscript). License: all rights reserved. | 3.47 MB | Adobe PDF |
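
The class-balancing part of the weight map mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the inverse-frequency weighting, the function names `class_balance_weight_map` and `weighted_cross_entropy`, and the normalisation by the sum of weights are all assumptions chosen for illustration. The local prior rules are not sketched, since the abstract does not specify them.

```python
import numpy as np


def class_balance_weight_map(labels, num_classes):
    """Per-pixel weights inversely proportional to class frequency.

    Assumption: weight of class c is 1 / (num_classes * freq_c), so every
    class present in the image contributes the same total weight.
    """
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    freq = counts / labels.size
    class_weights = np.zeros(num_classes)
    present = freq > 0  # classes absent from the image keep weight 0
    class_weights[present] = 1.0 / (num_classes * freq[present])
    return class_weights[labels]  # look up the weight of each pixel's class


def weighted_cross_entropy(probs, labels, weight_map, eps=1e-12):
    """Weighted pixel-wise cross-entropy.

    probs: (H, W, C) predicted class probabilities
    labels: (H, W) integer ground-truth classes
    weight_map: (H, W) per-pixel weights
    """
    rows, cols = np.indices(labels.shape)
    p_true = probs[rows, cols, labels]  # probability assigned to the true class
    losses = -weight_map * np.log(p_true + eps)
    return float(losses.sum() / weight_map.sum())


# Hypothetical toy example: class 0 = non-text (frequent), class 1 = text (rare).
labels = np.array([[0, 0, 0, 1],
                   [0, 0, 0, 0]])
weight_map = class_balance_weight_map(labels, num_classes=2)
probs = np.full(labels.shape + (2,), 0.5)  # an uninformative prediction
loss = weighted_cross_entropy(probs, labels, weight_map)
```

In this toy case the single text pixel receives a much larger weight than each non-text pixel, so errors on the rare class are not drowned out by the frequent one. In a real training loop the same weight map would multiply the per-pixel loss of the segmentation network.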
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.