Notes on lexical strategies, structural strategies and surface clause indexes in the C-ORAL-ROM spoken corpora / E. CRESTI. - STAMPA. - (2005), pp. 209-256.

Notes on lexical strategies, structural strategies and surface clause indexes in the C-ORAL-ROM spoken corpora

CRESTI, EMANUELA
2005

Abstract

This contribution is the last chapter of the volume accompanying the DVD publication of the spoken Romance corpus C-ORAL-ROM. The C-ORAL-ROM resource is a multilingual corpus of spontaneous speech for the main Romance languages (French, Italian, Portuguese, Spanish), comprising around 1,200,000 words and 124 hours of speech and integrated with tools for exploiting linguistic information at the textual and acoustic levels. C-ORAL-ROM is the main outcome of an EU project in the IST program of the 5th Framework Program (IST2000-26228), coordinated by Emanuela Cresti; it is distributed through ELDA and John Benjamins Publishing Company and has been widely disseminated in the scientific community. After a general premise on the comparison between spoken and written texts, this paper develops a detailed corpus-based description of the most relevant linguistic strategies concerning speech lexicon and syntax shared by the four Romance languages. The topics considered include quantitative data on the percentages of verbs and nouns (in particular, the greater use of verbs in accordance with the diaphasic variation built into the corpus design), set against a substantial homogeneity across the four languages; the high occurrence of verbless utterances; and some general tendencies of utterance information structure, in which simple utterances account for the highest percentage, followed by the topic-comment pattern. The final part of the paper is devoted to the analysis of coordination, subordination and negation in speech, through the automatic retrieval of the most common coordinating and subordinating conjunctions and negative adverbs, along with their distribution. In conclusion, the percentage data and the specific construction strategies emerging from the corpus analysis allow us to hypothesize a new perspective in the study of spoken language.
Year: 2005
ISBN: 9781588115485
Book title: C-ORAL-ROM. Integrated reference corpora for spoken romance languages
Pages: 209-256
Author: E. CRESTI
Files in this record:
File: cresti-strategies-1.pdf
Access: Closed
Type: Final refereed version (Postprint, Accepted manuscript)
License: All rights reserved
Size: 1.84 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/227835