Morphosyntactic Variation in Italian and Romansh Dialects: The Manzini & Savoia (2005) Corpus Within Project CHANGES / Greta Mazzaggio, Carlo Zoli, Neri Binazzi, Luca Andrea Ludovico, Mael Vittorio Vena, Maria Rita Manzini, Leonardo Maria Savoia. Electronic. (2025), pp. 1-8. (Paper presented at the Digital Heritage 2025 conference.)
Morphosyntactic Variation in Italian and Romansh Dialects: The Manzini & Savoia (2005) Corpus Within Project CHANGES
Greta Mazzaggio; Neri Binazzi; Maria Rita Manzini; Leonardo Maria Savoia
2025
Abstract
This paper presents the digitization and online dissemination of the Manzini & Savoia (2005) corpus, one of the most comprehensive resources on morphosyntactic variation in Italian and Romansh dialects. Developed within Project CHANGES (Cultural Heritage Active Innovation for Sustainable Society), the initiative responds to the urgent need for systematic documentation and open-access preservation of linguistic diversity. The new platform integrates a relational PostgreSQL database, a Strapi-based backend, and an interactive web interface, offering multiple modes of exploration, including map-based navigation, morphosyntactic query, and access to original fieldwork notebooks. The entire dataset (64,472 examples with IPA transcription and metadata) is openly available on Zenodo for independent research and reuse. The project also explores experimental applications of Large Language Models (LLMs) for automatic annotation, demonstrating the potential for computational approaches in dialectology. This work provides a replicable model for sustainable digital archiving and fosters interdisciplinary research across linguistic, computational, and cultural heritage domains.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| dh20253066.pdf | Open access | Publisher's PDF (Version of record) | Creative Commons | 21.82 MB | Adobe PDF |
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.



