Automatic generation of scientific papers for data augmentation in document layout analysis / Pisaneschi, Lorenzo; Gemelli, Andrea; Marinai, Simone. - In: PATTERN RECOGNITION LETTERS. - ISSN 0167-8655. - ELECTRONIC. - 167:(2023), pp. 38-44. [10.1016/j.patrec.2023.01.018]
Automatic generation of scientific papers for data augmentation in document layout analysis
Pisaneschi, Lorenzo; Gemelli, Andrea; Marinai, Simone
2023
Abstract
Document layout analysis is an important task for extracting information from scientific literature. Deep-learning solutions for document layout analysis require large collections of training data that are not always available. We generate a large number of synthetic pages to subsequently train a neural network to perform document object detection. The proposed pipeline allows users to deal with less common layouts for which it is not easy to find large annotated datasets. High-quality annotations for a small collection of papers are obtained through a semi-automatic approach. Then, a generative model, based on LayoutTransformer, is used to generate plausible layouts that are subsequently populated with random information to perform data augmentation. We evaluate the proposed method considering scientific articles with two different types of layouts: double and single columns. For double-column papers, we improve detection by 1% starting from 385 manually annotated scientific articles. For single-column papers, we improve detection by 49% starting from 218 articles.

File | Size | Format |
---|---|---|
1-s2.0-S0167865523000247-main.pdf (open access; Type: Publisher's PDF (Version of record); License: Open Access) | 2.31 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.