We study annotation projection in text classification problems where source documents are published in multiple languages and may not be an exact translation of one another. In particular, we focus on the detection of unfair clauses in privacy policies and terms of service. We present the first English-German parallel asymmetric corpus for the task at hand. We study and compare several language-agnostic sentence-level projection methods. Our results indicate that a combination of word embeddings and dynamic time warping performs best.

Cross-lingual Annotation Projection in Legal Texts / Galassi A.; Drazewski K.; Lippi M.; Torroni P. - Electronic. - (2020), pp. 915-926. (Paper presented at the 28th International Conference on Computational Linguistics, COLING 2020, held in esp in 2020).

Cross-lingual Annotation Projection in Legal Texts

Lippi M.
2020

2020
COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference
28th International Conference on Computational Linguistics, COLING 2020
esp
2020
Galassi A.; Drazewski K.; Lippi M.; Torroni P.


Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1356528