Perl and Python implementations give different scores with duplicates
Take a file with duplicates:
#begin document (Dups);
test1 0 0 a1 (0)|(1)
test1 0 1 a2 -
test1 0 2 junk -
test1 0 3 b1 (1)
test1 0 4 b2 -
test1 0 5 b3 -
test1 0 6 b4 -
test1 0 7 jnk -
test1 0 8 . -
#end document
Cf. the attached dups.txt. The (0)|(1) annotation on a1 places that token in two entities at once: entity 0 is the singleton {a1} and entity 1 is {a1, b1}, so a1 is a duplicated mention.
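To make that concrete, here is a rough sketch of parsing the coref column into entities, assuming plain CoNLL-2012 bracket notation. This illustrative helper is hypothetical, not coval's actual parser, and a real one would need a stack to handle nested mentions of the same entity:

from collections import defaultdict

def parse_coref_column(labels):
    # Map entity id -> list of (start, end) token spans.
    entities = defaultdict(list)
    open_spans = {}  # entity id -> start token of a currently open mention
    for tok, label in enumerate(labels):
        if label == "-":
            continue
        for part in label.split("|"):
            eid = int(part.strip("()"))
            if part.startswith("("):
                open_spans[eid] = tok
            if part.endswith(")"):
                entities[eid].append((open_spans.pop(eid, tok), tok))
    return dict(entities)

# Coref column of dups.txt, one label per token:
labels = ["(0)|(1)", "-", "-", "(1)", "-", "-", "-", "-", "-"]
print(parse_coref_column(labels))
# {0: [(0, 0)], 1: [(0, 0), (3, 3)]} -- span (0, 0) sits in both entities,
# matching the "Entity 0" / "Entity 1" lines in the Perl scorer's output below.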
The two LEA implementations give different scores:
andreas@thinkpad:~/code/coval/% ~/src/reference-coreference-scorers/scorer.pl lea /tmp/dups.txt /tmp/dups.txt
version: 9.0.0-alpha /home/andreas/src/reference-coreference-scorers/lib/CorScorer.pm
====> (Dups);:
File (Dups);:
Entity 0: (0,0)
Entity 1: (0,0) (3,3)
====> (Dups);:
File (Dups);:
Entity 0: (0,0)
Entity 1: (0,0) (3,3)
(Dups);:
Repeated mention in the key: 0, 0 01
Repeated mention in the response: 0, 0 11
Total key mentions: 2
Total response mentions: 2
Strictly correct identified mentions: 2
Partially correct identified mentions: 0
No identified: 0
Invented: 1
Recall: (1 / 3) 33.33% Precision: (0 / 2) 0% F1: 0%
--------------------------------------------------------------------------
====== TOTALS =======
Identification of Mentions: Recall: (2 / 2) 100% Precision: (2 / 2) 100% F1: 100%
--------------------------------------------------------------------------
Coreference: Recall: (1 / 3) 33.33% Precision: (0 / 2) 0% F1: 0%
--------------------------------------------------------------------------
andreas@thinkpad:~/code/coval/% python3 scorer.py /tmp/dups.txt /tmp/dups.txt
Warning: A single mention is assigned to more than one cluster: [0, 1]
Warning: A single mention is assigned to more than one cluster: [0, 1]
recall precision F1
mentions 100.00 100.00 100.00
muc 100.00 100.00 100.00
bcub 100.00 100.00 100.00
ceafe 100.00 100.00 100.00
ceafm 100.00 100.00 100.00
lea 66.67 66.67 66.67
CoNLL score: 100.00
Since the file is being scored against itself, one would expect all scores to be 100%, don't you agree?
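Working through the LEA definition (Moosavi & Strube 2016) by hand supports that expectation. Below is a minimal, self-contained sketch of LEA recall (precision is the same call with key and response swapped), using the common convention that a singleton is resolved only if it also appears as a singleton in the response. Scoring the two entities of dups.txt against themselves, duplicate included, gives 1.0. The 66.67 from the Python scorer is consistent with the duplicated mention being counted in only one of its two clusters on one side: the singleton then resolves nothing, and the score becomes (1*0 + 2*1)/3 = 2/3.

from itertools import combinations

def lea_recall(key, response):
    # LEA: each key entity contributes its size, weighted by the
    # fraction of its coreference links found in the response.
    num = den = 0.0
    for k in key:
        if len(k) == 1:
            # Self-link convention: a singleton is resolved iff the
            # response also contains it as a singleton.
            resolved, total = (1 if k in response else 0), 1
        else:
            total = len(k) * (len(k) - 1) // 2
            resolved = sum(
                1 for m1, m2 in combinations(sorted(k), 2)
                if any(m1 in r and m2 in r for r in response)
            )
        num += len(k) * resolved / total
        den += len(k)
    return num / den

# Entities from dups.txt: a1 is both a singleton and part of {a1, b1}.
entities = [frozenset({"a1"}), frozenset({"a1", "b1"})]
print(lea_recall(entities, entities))  # 1.0 -> LEA F1 should be 100%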