Text Classification for Italian Proficiency Evaluation / Alfredo Milani; Stefania Spina; Valentino Santucci; Luisa Piersanti; Marco Simonetti; Giulio Biondi. - ELECTRONIC. - 11619:(2019), pp. 830-841. (Paper presented at the International Conference on Computational Science and Its Applications, held in Saint Petersburg, Russia, 01/07/2019 - 04/07/2019) [10.1007/978-3-030-24289-3_61].

Text Classification for Italian Proficiency Evaluation

Marco Simonetti; Giulio Biondi
2019

Abstract

NLP technologies and components are increasingly used in the large-scale analysis of text-based dialogues, for example in classifiers for sentiment polarity, trend clustering of online messages, and hate speech detection. In this work we present the design and implementation of an automatic classification tool for evaluating the complexity of Italian texts as understood by a speaker of Italian as a second language. The classification is performed within the Common European Framework of Reference for Languages (CEFR), which aims at classifying speakers' language proficiency. Results of preliminary experiments on a data set of real texts, annotated by experts and used in actual CEFR exam sessions, show a strong ability of the proposed system to label texts with the correct language proficiency class and a great potential for its integration into learning tools, such as systems supporting examiners in test design and in the automatic evaluation of writing abilities.
Computational Science and Its Applications - ICCSA 2019
International Conference on Computational Science and Its Applications
Saint Petersburg, Russia
01/07/2019 - 04/07/2019
Alfredo Milani; Stefania Spina; Valentino Santucci; Luisa Piersanti; Marco Simonetti; Giulio Biondi
Files in this item:
File: ItalianoL2+(1).pdf
Access: Open access
Type: Final refereed version (Postprint, Accepted manuscript)
License: Creative Commons
Size: 402.99 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1293243