Methodology for the comparison of human judgments with metrics for coreference resolution
Mariya Borovikova 1, *, Loïc Grobol 1, 2, *, Anaïc Lefeuvre-Halftermeyer 1, Sylvie Billot 1
1 : Laboratoire d'Informatique Fondamentale d'Orléans
Université d'Orléans : EA4022, Institut National des Sciences Appliquées - Centre Val de Loire : EA4022, Institut National des Sciences Appliquées
2 : Modèles, Dynamiques, Corpus
Université Paris Nanterre : UMR7114, Centre National de la Recherche Scientifique : UMR7114
* : Corresponding author

We propose a method for investigating the
interpretability of metrics used for the coreference
resolution task through comparisons with
human judgments. We provide a corpus with
annotations of different error types and human
evaluations of their severity. Our preliminary
analysis shows that metrics largely overlook
several error types and, more generally, penalize
errors less than human judges do.
This study is conducted on French texts, but
the methodology is language-independent.

