Computing clusters with systems with equal output

Open tuetschek opened this issue 8 years ago • 0 comments

How should the system ranking clusters be computed when systems often produce the same output and are merged in the results CSV file? Is the scripts/compute_ranking_clusters.perl script the correct way to do it?

This script seems to ignore merged systems in the results CSV file: an entry such as sysA+sysB is treated as a separate, new system. I have fixed it in this commit in my fork. Was that the correct thing to do, or is there a better way of getting the ranking clusters?

(Without this fix, the clustering script would get stuck in an infinite loop on my data, which consists of several variants of the same NLG system that often produce identical outputs.)
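
To make the question concrete, here is a minimal Python sketch of the kind of handling I mean. It is not the code from the commit and not what the Perl script does. It assumes that merged entries are joined with "+" (as in sysA+sysB), that a lower rank is better, and that all systems in a merged entry share that entry's rank. The function names `split_merged` and `pairwise_outcomes` are hypothetical.

```python
from itertools import combinations


def split_merged(name, sep="+"):
    """Split a merged entry such as 'sysA+sysB' into its member systems."""
    return [part for part in name.split(sep) if part]


def pairwise_outcomes(ranks, sep="+"):
    """Expand merged entries and list pairwise comparisons.

    ranks: mapping from a (possibly merged) system name to its rank,
           where a lower rank is assumed to be better.
    Returns a list of (system_a, system_b, relation) tuples, where
    relation is '<' (a ranked better), '>' (b ranked better) or '=' (tie).
    """
    # Give every member of a merged entry the rank of that entry.
    expanded = {}
    for name, rank in ranks.items():
        for system in split_merged(name, sep):
            expanded[system] = rank

    outcomes = []
    for a, b in combinations(sorted(expanded), 2):
        if expanded[a] < expanded[b]:
            outcomes.append((a, b, "<"))
        elif expanded[a] > expanded[b]:
            outcomes.append((a, b, ">"))
        else:
            outcomes.append((a, b, "="))
    return outcomes


if __name__ == "__main__":
    # One judgement in which sysA and sysB produced the same output.
    for outcome in pairwise_outcomes({"sysA+sysB": 1, "sysC": 2}):
        print(outcome)
```

With this expansion, sysA and sysB each keep their own identity and are recorded as tied with each other, instead of showing up as a third system named sysA+sysB.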

tuetschek, Feb 07 '17 22:02