Towards a Linguistic Linked Open Data Resource for Italian Cultural Heritage: The Lessico dei Beni Culturali Corpus / Riccardo Billero. - Electronic. - (2026), pp. 22-28. (10th Workshop on Linked Data in Linguistics (LDL-2026) @ LREC 2026, Palma de Mallorca, 12 May 2026).

Towards a Linguistic Linked Open Data Resource for Italian Cultural Heritage: The Lessico dei Beni Culturali Corpus

Riccardo Billero
2026

Abstract

We present an ongoing effort to bridge the Lessico dei Beni Culturali (LBC), a multilingual lexicographic project covering Italian cultural heritage terminology, with the Linguistic Linked Open Data (LLOD) ecosystem. The LBC corpus spans five centuries of art-historical writing, from fifteenth- and sixteenth-century treatises by Alberti, Leonardo, and Vasari to nineteenth-century works by Stendhal and Burckhardt and contemporary tourist guides to Florence, with source texts in several European languages alongside their translations. The resource has already undergone automatic linguistic annotation and term extraction, but it lacks a structured lexical representation in any standard LLOD formalism. We describe the current state of the resource and identify the main challenges for its publication as Linked Data, including the modelling of culturally bound terms (realia), historical proper nouns, and multilingual source texts of different registers. We then outline a roadmap towards its representation in OntoLex-Lemon (McCrae et al., 2017) and its alignment with existing LLOD resources such as the Getty Vocabularies (Getty Research Institute, 2024a) and Wikidata (Vrandečić and Krötzsch, 2014). By sharing this work with the LLOD community, we hope to gather input on best practices for art-historical and cultural heritage lexicons, with the aim of improving interoperability between resources of different origins, enabling new information to be derived, and increasing the value of existing data.
Proceedings of the 10th Workshop on Linked Data in Linguistics (LDL-2026) @ LREC 2026
Palma de Mallorca
12 May 2026

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1470955