Text Line Extraction in Handwritten Historical Documents / Capobianco, Samuele; Marinai, Simone. - PRINT. - 733:(2017), pp. 68-79. (Paper presented at the 13th Italian Research Conference on Digital Libraries, IRCDL 2017, held in ita in 2017) [10.1007/978-3-319-68130-6_6].
Text Line Extraction in Handwritten Historical Documents
Capobianco, Samuele; Marinai, Simone
2017
Abstract
We present a novel approach for extracting text lines from handwritten documents, using a Convolutional Neural Network (CNN) to label document image patches as text lines or separators. We first process the document to identify the most suitable patch size on the basis of an overall estimate of the text line distance. Using this information, we then extract several patches to train the CNN model. Finally, we use the trained model to segment text lines. We have evaluated this technique on the public Saint Gall database and on a private collection provided by the Ancestry company.

File | Size | Format |
---|---|---|
paper15.pdf (Closed access; Type: Publisher's PDF, version of record; License: All rights reserved) | 9.68 MB | Adobe PDF |
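The first two steps named in the abstract (estimating the text line distance, then cutting patches of a matching size) can be illustrated with a minimal sketch. The abstract does not say how the distance is estimated, so the autocorrelation of the row projection profile used here is an assumption, not the authors' method. The function names `estimate_line_distance` and `extract_patches`, the `min_lag` parameter, and the choice of a square patch equal to the line distance are all hypothetical.

```python
import numpy as np

def estimate_line_distance(binary_img, min_lag=5):
    """Estimate the dominant vertical distance between text lines, in pixels.

    binary_img: 2-D array with 1 for ink and 0 for background.
    Assumption: the distance is taken as the strongest autocorrelation peak
    of the row projection profile; the paper may use a different estimator.
    """
    profile = binary_img.sum(axis=1).astype(float)
    profile -= profile.mean()
    # Full autocorrelation; keep only the non-negative lags (lag 0 first).
    ac = np.correlate(profile, profile, mode="full")[len(profile) - 1:]
    max_lag = len(profile) // 2
    # Skip very small lags, which only reflect the stroke thickness.
    return min_lag + int(np.argmax(ac[min_lag:max_lag]))

def extract_patches(img, size, stride):
    """Cut square patches of side `size` on a regular grid (hypothetical helper)."""
    h, w = img.shape
    return np.stack([
        img[y:y + size, x:x + size]
        for y in range(0, h - size + 1, stride)
        for x in range(0, w - size + 1, stride)
    ])

# Synthetic page: 8-pixel-thick "text lines" repeated every 20 rows.
page = np.zeros((200, 100), dtype=np.uint8)
page[np.arange(200) % 20 < 8, :] = 1

distance = estimate_line_distance(page)
patches = extract_patches(page, size=distance, stride=distance)
print(distance, patches.shape)
```

In a full pipeline, patches like these would be labelled as text line or separator and used to train the CNN. That part is omitted here because the abstract gives no architectural details.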
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.