jiwer
Could someone please explain the difference between wer(ref, hypo) on concatenated strings and wer(ref_list, hypo_list)?
I am trying to calculate WER between model transcriptions and ground truths, but the two approaches give very different results, sometimes differing by more than 5%! Using lists gives a better WER than using the concatenated strings. Could someone be kind enough to explain the difference and which is better to use? Thank you!
You can try visualizing the alignment to spot why the number of substitutions, deletions, and insertions decreases when you concatenate the strings.
See https://jitsi.github.io/jiwer/reference/alignment/#alignment.visualize_alignment.
Generally, if the sentences are independent, you should not concatenate them.
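To illustrate the point about independent sentences, here is a minimal pure-Python sketch. It is not jiwer's implementation; `edit_distance`, `wer_per_sentence`, and `wer_concatenated` are hypothetical helpers, and the per-sentence variant assumes errors are summed over sentence pairs and divided by the total number of reference words. The example sentences are made up.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_per_sentence(refs, hyps):
    """Align each reference/hypothesis pair separately, then pool the counts."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    total_ref_words = sum(len(r.split()) for r in refs)
    return errors / total_ref_words


def wer_concatenated(refs, hyps):
    """Join everything into one long string, so the alignment may cross sentence boundaries."""
    ref_words = " ".join(refs).split()
    hyp_words = " ".join(hyps).split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# Made-up example: the word "on" ends up in the wrong hypothesis sentence.
refs = ["the cat sat", "on the mat"]
hyps = ["the cat sat on", "the mat"]

per_sentence = wer_per_sentence(refs, hyps)   # one insertion + one deletion
concatenated = wer_concatenated(refs, hyps)   # the misplaced word is "forgiven"
print(per_sentence, concatenated)
```

In the per-sentence case the misplaced word counts as one insertion in the first pair and one deletion in the second. After concatenation the two word sequences are identical, so the error disappears. The same effect on real data would explain a gap between the two numbers.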