evaluate
TIMIT typically reports PER, not WER
The docs here mention that TIMIT reports WER, but this dataset typically serves as a benchmark for phone error rate (PER), because it is one of the few resources with manually annotated phone segments. I recommend fixing and clarifying that in the README:
https://github.com/huggingface/evaluate/blob/c1141b02941dc508ca4560b9dfe5b7a90f4cf785/metrics/wer/README.md?plain=1#L68
I think it would be good to draw a clear distinction between word, character, phone, and token error rate (WER/CER/PER/TER) at the library level.
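As a minimal illustration of that distinction (a sketch, not the library's actual implementation): all four metrics are the same edit-distance computation, differing only in the token unit the sequences are split into.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (two-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution
        prev = cur
    return prev[-1]

def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by reference length."""
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)

ref, hyp = "the cat sat", "the cat sag"
wer = error_rate(ref.split(), hyp.split())  # word tokens  -> 1/3
cer = error_rate(list(ref), list(hyp))      # char tokens  -> 1/11
# PER is the same computation over phone symbols (hypothetical ARPAbet-style
# transcription here, just for illustration):
per = error_rate("dh ax k ae t".split(), "dh ax k ae d".split())  # -> 1/5
```

TER for machine translation additionally allows shifts, but for ASR-style metrics the tokenization is essentially the only difference, which is why a shared explanation at the library level would help.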