The LABLITA Speech Resources

Cresti, Emanuela; Gregori, Lorenzo; Moneglia, Massimo; Nicolas Martinez, Maria Carlota; Panunzi, Alessandro

doi:10.17469/O2106SLI000005

The LABLITA lab of the University of Florence makes three main spoken corpora available on the web: the LABLITA reference corpus of spoken Italian, the IPIC cross-linguistic database of information structure, and the C-Or-DiAL Spoken Spanish corpus for teaching Spanish L2. These resources have been annotated according to the Language into Act Theory (Cresti 2000) with regard to prosody and its relationship with pragmatics and information structure; in each, the speech flow is segmented into utterances and information units at perceptually relevant prosodic breaks. The LABLITA corpus documents the diaphasic variation of the Italian spoken in Tuscany, following a detailed corpus design. DB-IPIC, based on a heavily annotated sub-corpus of the LABLITA corpus and on comparable Spanish and Brazilian Portuguese corpora, lets the user retrieve how information is structured in spontaneous speech and observe how information structure varies cross-linguistically. C-Or-DiAL offers teachers and learners of Spanish L2 a dedicated resource for integrating speech into learning activities.

The LABLITA Speech Resources / Emanuela Cresti, L.G. - Electronic. - (2022), pp. 85-108. [10.17469/O2106SLI000005]