Deep Learning-based Object Detection Models applied to Document Images / Zahra Ziran. - (2020).

Deep Learning-based Object Detection Models applied to Document Images

Zahra Ziran
2020

Abstract

Document Image Analysis (DIA) is concerned with obtaining computer-readable descriptions from document images. Understanding and recognizing the wide spectrum of complex document images, from business and financial documents to floor plans, poses a key challenge because of the high-level semantic information such documents carry. The primary task is then to isolate the different kinds of content present in a document (e.g., graphical and textual components). The main objective of this thesis is the recognition and understanding of graphical documents, using deep learning-based object detection models, in order to generate accessible graphical documents. To this end, the thesis first addresses object detection in floor plans by creating and extending floor plan data sets and then proposing reliable detection approaches that operate suitably in real scenarios. Second, it investigates the role of transcript alignment in early printed, loosely annotated texts as a means of supporting word detection in unknown images.
2020
Prof. Simone Marinai
IRAN
Zahra Ziran
Files in this item:
File: PhD_Thesis_ZahraZiran.pdf
Access: open access
Type: Doctoral thesis
License: Open Access
Size: 9.76 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1189758