A Corpus for Multilingual Analysis of Online Terms of Service / Drazewski K.; Galassi A.; Jablonowska A.; Lagioia F.; Lippi M.; Micklitz H.W.; Sartor G.; Tagiuri G.; Torroni P. - ELECTRONIC. - (2021), pp. 1-8. (Paper presented at the 3rd Natural Legal Language Processing conference, NLLP 2021, held in dom in 2021).
A Corpus for Multilingual Analysis of Online Terms of Service
Lippi M.
2021
Abstract
We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.