Some short options are broken
When executing ./multeval.sh eval -R example/refs.test2010.lc.tok.en.0 --hyps-baseline example/hyps.lc.tok.en.baseline.opt0 --metrics ter I get the following error:
Failed to specify required options: [refs]
When I change -R to --refs, everything works as expected.
Other short flags such as -t or -d are also affected: they do not change the behavior of multeval.
I do not know whether every short flag is broken.
It looks like -P is also affected, and this makes multeval behave differently from what is expected (see table below). Sorry this comment is so long; I wrote most of it up before seeing this existing ticket.
tl;dr
Use --ter.punctuation instead of -P
To reproduce at commit bd93ed, run the following command with each combination of $punctFlag and $punctOption. (The default setting for punctuation is false, per TER.java; alternatively, you can instrument the value of punctuation.)
$ ./multeval.sh eval --refs example/refs.test2010.lc.tok.en.0 \
> --hyps-baseline example/hyps.lc.tok.en.baseline.opt0 \
> --metrics ter \
> $punctFlag $punctOption
Here are the resulting TER scores for each combination of command-line arguments. Only --ter.punctuation true seems to have any effect.
| $punctFlag | $punctOption | Result |
|---|---|---|
| | | baseline 65.5 (0.4/*/-) |
| -P | | baseline 65.5 (0.4/*/-) |
| -P | true | baseline 65.5 (0.4/*/-) <-- should be 68.6? |
| -P | false | baseline 65.5 (0.4/*/-) |
| -P=true | | baseline 65.5 (0.4/*/-) |
| -Ptrue | | baseline 65.5 (0.4/*/-) |
| --ter.punctuation | 1 | baseline 65.5 (0.4/*/-) |
| --ter.punctuation | true | baseline 68.6 (0.4/*/-) <-- |
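
In case it helps anyone re-check this, here is a minimal sketch that generates the command line for each variant tested above. The function name print_repro_commands is my own (hypothetical, not part of multeval), and it only prints the commands rather than running them, so it makes no assumptions about multeval's output.

```shell
# Hypothetical helper (not part of multeval): prints one command line
# per $punctFlag/$punctOption combination from the results above.
print_repro_commands() {
  for variant in "" "-P" "-P true" "-P false" "-P=true" "-Ptrue" \
                 "--ter.punctuation 1" "--ter.punctuation true"; do
    # $variant is deliberately unquoted so that "-P true" splits into
    # two arguments and the empty variant adds no argument at all.
    echo ./multeval.sh eval --refs example/refs.test2010.lc.tok.en.0 \
      --hyps-baseline example/hyps.lc.tok.en.baseline.opt0 \
      --metrics ter $variant
  done
}

print_repro_commands
```

Each printed line can be pasted into a shell at the repository root to compare the resulting TER score against the expected value.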
I think all of them are broken actually.