Could someone please explain the difference between wer(ref, hypo) on concatenated strings and wer(ref_list, hypo_list)?

Open wannasleepforlong opened this issue 11 months ago • 1 comments

I am trying to calculate the WER between model transcriptions and ground truths, but the two approaches give very different results, sometimes differing by more than 5%! Passing lists gives a better WER than passing the concatenated strings. Could someone be kind enough to explain the difference and which is better to use? Thank you!

wannasleepforlong avatar Feb 10 '25 14:02 wannasleepforlong

You can try visualizing the alignment to spot why the number of substitutions, deletions, and insertions decreases when you concatenate the strings.

See https://jitsi.github.io/jiwer/reference/alignment/#alignment.visualize_alignment.

Generally, if the sentences are independent, you should not concatenate them.
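
To illustrate why the two calls can disagree, here is a minimal pure-Python sketch. It is not jiwer's implementation: `word_edit_distance`, `pooled_wer`, and `concatenated_wer` are hypothetical helpers written for this example, and the sentences are made up. It assumes that list input is scored by aligning each sentence pair separately and pooling the errors over the total number of reference words, while concatenation lets a single alignment run across sentence boundaries.

```python
def word_edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word has no match
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def pooled_wer(refs, hyps):
    """Align each sentence pair on its own, then pool errors over all reference words."""
    errors = sum(
        word_edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps)
    )
    total_ref_words = sum(len(r.split()) for r in refs)
    return errors / total_ref_words


def concatenated_wer(refs, hyps):
    """Join everything into one long sequence and align it once."""
    ref_words = " ".join(refs).split()
    hyp_words = " ".join(hyps).split()
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


# Made-up data: the word "sat" ends up in the wrong hypothesis segment.
refs = ["the cat sat", "on the mat"]
hyps = ["the cat", "sat on the mat"]

print(f"per-sentence (pooled): {pooled_wer(refs, hyps):.3f}")
print(f"concatenated:          {concatenated_wer(refs, hyps):.3f}")
```

In this toy case the per-sentence score counts one deletion and one insertion (2 errors over 6 reference words), while the concatenated score is 0 because the misplaced word lines up once the sentence boundary is gone. With independent sentences that kind of cross-boundary match hides real errors.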

nikvaessen avatar Feb 10 '25 19:02 nikvaessen