BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations / Giovannini, Simone; Coppini, Fabio; Gemelli, Andrea; Marinai, Simone. - In: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION. - ISSN 1433-2833. - ELECTRONIC. - (2025), pp. 0-16. [10.1007/s10032-025-00563-5]
BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations
Giovannini, Simone; Marinai, Simone
2025
Abstract
We present a unified dataset for document Question-Answering (QA), obtained by combining several public datasets related to Document AI and visually rich document understanding (VRDU). Our main contribution is twofold. First, we reformulate existing Document AI tasks, such as Information Extraction (IE), as a Question-Answering task, making the dataset a suitable resource for training and evaluating Large Language Models. Second, we release the OCR of all the documents and include the exact position of the answer in the document image as a bounding box. Using this dataset, we explore the impact of different prompting techniques (which may include bounding box information) on the performance of open-weight models, identifying the most effective approaches for document comprehension.

| File | Size | Format |
|---|---|---|
| s10032-025-00563-5.pdf (open access; Type: Publisher's PDF (Version of record); License: Open Access) | 1.56 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.



