Sphinx format problem when using -id argument

Open PanosAntoniadis opened this issue 4 years ago • 1 comments

In Sphinx format the hypothesis file has the following form:

hypothesis_text (file_id score)

while the transcription file has the form:

transcription_text (file_id)

So when I run the wer command the following error occurs:

$ wer transcriptions hypothesis -id
Reference and hypothesis IDs do not match! ref="(data_005)" hyp="-7716)"
File lines in hyp file should match those in the ref file.

I think this occurs because the score parameter is not taken into account: the file ID of the transcription is compared to the score instead of the file ID of the hypothesis.
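
A possible workaround, sketched under the assumption that every scored line ends in exactly `(file_id score)` as described above, is to strip the score from the hypothesis file before running `wer`. The helper names `strip_score` and `strip_scores` are hypothetical and not part of asr-evaluation:

```python
import re

# Assumed line shape, taken from the description above: "text (file_id score)".
# The pattern matches a trailing parenthesised group holding two
# whitespace-separated tokens and captures only the first one (the file ID).
_SCORED_ID = re.compile(r"\((\S+)\s+\S+\)\s*$")


def strip_score(line: str) -> str:
    """Rewrite 'text (file_id score)' as 'text (file_id)'.

    Lines that do not end in a two-token parenthesised group,
    such as 'text (file_id)', are returned unchanged.
    """
    return _SCORED_ID.sub(r"(\1)", line.rstrip("\n"))


def strip_scores(lines):
    """Apply strip_score to every line of a hypothesis file."""
    return [strip_score(line) for line in lines]
```

Writing the cleaned lines to a new file and passing that file to `wer` in place of the original hypothesis file should let the ID check compare `(data_005)` against `(data_005)` rather than against the score.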

PanosAntoniadis avatar Jul 14 '19 14:07 PanosAntoniadis

Hi @PanosAntoniadis , it's been a while since I looked at the Sphinx output files, and it's possible they've changed in the meantime. Perhaps Sphinx has an option to not output the scores? Or maybe you'd like to submit a PR with a fix?

belambert avatar Aug 30 '19 03:08 belambert