From pictorial space to tactile form: a comparative evaluation of AI-based 2.5D reconstruction from modern artwork paintings / Furferi, Rocco; Governi, Lapo; Volpe, Yary; Servi, Michaela; Buonamici, Francesco. - In: FRONTIERS IN COMPUTER SCIENCE. - ISSN 2624-9898. - ELECTRONIC. - 8:(2026), pp. 0-0. [10.3389/fcomp.2026.1821454]

From pictorial space to tactile form: a comparative evaluation of AI-based 2.5D reconstruction from modern artwork paintings

Furferi, Rocco; Governi, Lapo; Volpe, Yary; Servi, Michaela; Buonamici, Francesco
2026

Abstract

Introduction: Translating paintings into tactile 2.5D models (i.e., bas-reliefs) is a significant step toward improving accessibility for blind and visually impaired individuals. However, reconstructing spatial structure from a single painted image without explicit perspective is inherently ill-posed, particularly for modern and contemporary artworks in which perspective, illumination, and geometry deviate from physical realism.

Methods: This study presents a comparative evaluation of three AI-based reconstruction paradigms: Monocular Depth Estimation, Large Language Models, and Large Reconstruction Models. These approaches are applied to a selected corpus of photographic, realist, and abstract artworks from the CSAC collection (Parma, Italy). An assessment framework is introduced that combines expert-based qualitative evaluation by art historians, formal geometric verification (including integrability and topological consistency), and manufacturability analysis conducted by additive manufacturing specialists.

Results: Large Language Model-based methods generate semantically rich and perceptually plausible bas-reliefs but lack geometric integrability and topological robustness. Monocular Depth Estimation models capture depth hierarchies well but tend to oversmooth fine details. Large Reconstruction Models demonstrate strong structural coherence and fabrication readiness, though they often struggle with stylistic reinterpretation.

Discussion: These findings highlight the trade-offs among current AI-based reconstruction approaches for tactile bas-relief generation. Each paradigm excels in specific aspects, but none achieves a complete balance between perceptual fidelity, geometric soundness, and manufacturability. Future work should focus on hybrid strategies that integrate semantic understanding with geometric consistency to better support accessible cultural heritage applications.
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1464993