An Integrated System for Interacting with Multi-Page Scholarly Documents / Lorenzo Massai; Simone Marinai. - Electronic. - 3937 (2025), pp. 14-26. (Information and Research science Connecting to Digital and Library science, Udine, February 20-21, 2025).

An Integrated System for Interacting with Multi-Page Scholarly Documents

Lorenzo Massai; Simone Marinai
2025

Abstract

In this work we present a preliminary version of a comprehensive interface that helps users interact with scholarly documents, enabling multi-layered exploration and offering deeper insights by integrating diverse features and contextual information. By bridging these sources of information, our work pursues the identification, characterization, and linking of visual elements to semantic and contextual data, leveraging large language models for interoperability. Recent advances in retrieval-augmented generation are also exploited to address some limitations of language models, allowing them to access latent information from document representations such as graph and vector embeddings. The system under development analyzes input documents and extracts their visual and semantic features, making them accessible in a single framework. Associating structural information with visual data allows formal analysis of documents and is exploited in our model to enhance visual extraction through a novel ontology-based constraint violation detection step. The information extracted through this framework is semantically explorable and provides access to the document structure, which can be exploited in applications such as question answering and document understanding.
Proceedings of IRCDL 2025
Information and Research science Connecting to Digital and Library science
Udine
February 20-21, 2025.
Files in this record:
File: Massai_IRCDL25_article.pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: Creative Commons
Size: 2.68 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1425492