evaluate CER Normalization

I can't find a method in the documentation for normalizing the CER metric. Is this implemented?

cer.features
OUTPUT:
{'predictions': Value(dtype='string', id='sequence'),
 'references': Value(dtype='string', id='sequence')}

df[df['cer'] == 1.25]
OUTPUT:
GT | OCR_Prediction | wer | cer
2.49 | 749.00 | 1.0 | 1.25
2.49 | 749.00 | 1.0 | 1.25
2.50 | 1950.00 | 1.0 | 1.25

Jun 24 '22 15:06 Coliverfelt

Hi @Coliverfelt, not sure what you mean. You can find the implementation details here: https://huggingface.co/spaces/evaluate-metric/cer/blob/main/cer.py

Jun 24 '22 16:06 lvwerra

Hi @lvwerra! My point concerns cases where the CER metric is above 1.0, as in this example:

reference: 2.49 prediction: 749.00 cer: 1.25

substitute = 1, delete = 1, insert = 3, hits = 2, reference length = 4

The CER formula is (substitute + delete + insert) / reference length, so the result is 5 / 4 = 1.25.

But when the CER is greater than 1.00, as in this case, we should apply normalization. The normalized result should be:

The normalized CER formula is (substitute + delete + insert) / (substitute + delete + insert + hits), so the result is 5 / 7 = 0.7142857142857143.
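To make the arithmetic above easy to check, here is a minimal Python sketch that plugs the counts quoted in this post into both formulas. The variable names are illustrative only and are not part of the evaluate library.

```python
# Edit-operation counts quoted above for reference "2.49" vs prediction "749.00".
substitutions, deletions, insertions, hits = 1, 1, 3, 2

errors = substitutions + deletions + insertions      # 5 edit operations in total
reference_length = substitutions + deletions + hits  # 4 characters in "2.49"

# Plain CER: errors over reference length. It can exceed 1.0 when there are many insertions.
cer_value = errors / reference_length

# Normalized CER: errors over (errors + hits). It is always between 0 and 1.
normalized_cer_value = errors / (errors + hits)

print(cer_value)             # 1.25
print(normalized_cer_value)  # 0.7142857142857143
```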

And when I run:

cer.compute(predictions=['749.00'], references=['2.49'])

The result isn't normalized.

Did I make myself clearer?

Jun 27 '22 03:06 Coliverfelt

I couldn't find any reference for normalized CER. Do you have a link? We could add this as an optional config, e.g.:

cer.compute(references=references, predictions=predictions, normalized=True)

What do you think?
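As a rough sketch of how such an optional flag might behave, here is a self-contained pure-Python version. The names cer_sketch and edit_counts are hypothetical and this is not the evaluate implementation. It assumes counts are pooled across all pairs and that, when several minimum-edit alignments exist, the one with the most hits is used. That second assumption matters, because the normalized value depends on the hit count.

```python
def edit_counts(reference, prediction):
    """Return (substitutions, deletions, insertions, hits) for one string pair.

    Assumption: among alignments with the minimum number of edits,
    prefer the one with the most hits.
    """
    # Each cell is (errors, -hits, subs, dels, ins); tuples compare
    # lexicographically, so min() minimises errors first, then maximises hits.
    prev = [(j, 0, 0, 0, j) for j in range(len(prediction) + 1)]
    for i, ref_char in enumerate(reference, start=1):
        curr = [(i, 0, 0, i, 0)]
        for j, pred_char in enumerate(prediction, start=1):
            e, h, s, d, ins = prev[j - 1]
            if ref_char == pred_char:
                diagonal = (e, h - 1, s, d, ins)      # hit
            else:
                diagonal = (e + 1, h, s + 1, d, ins)  # substitution
            e, h, s, d, ins = prev[j]
            deletion = (e + 1, h, s, d + 1, ins)
            e, h, s, d, ins = curr[j - 1]
            insertion = (e + 1, h, s, d, ins + 1)
            curr.append(min(diagonal, deletion, insertion))
        prev = curr
    _, neg_hits, subs, dels, ins = prev[-1]
    return subs, dels, ins, -neg_hits


def cer_sketch(predictions, references, normalized=False):
    """Hypothetical CER with an optional normalized flag.

    Assumes at least one non-empty reference, so the denominator is non-zero.
    """
    errors = hits = reference_chars = 0
    for prediction, reference in zip(predictions, references):
        subs, dels, ins, pair_hits = edit_counts(reference, prediction)
        errors += subs + dels + ins
        hits += pair_hits
        reference_chars += len(reference)
    denominator = errors + hits if normalized else reference_chars
    return errors / denominator


print(cer_sketch(["749.00"], ["2.49"]))                   # 1.25
print(cer_sketch(["749.00"], ["2.49"], normalized=True))  # 0.7142857142857143
```

With this tie-breaking rule the example from earlier in the thread gives 1 substitution, 1 deletion, 3 insertions and 2 hits, matching the counts quoted there. A different alignment choice (for instance 3 substitutions and 2 insertions, which is also 5 edits) would leave the plain CER unchanged but would change the normalized value.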

Jul 07 '22 09:07 lvwerra

I think this could work well! I'm forwarding the link with the CER normalization content and an image with its explanation: https://towardsdatascience.com/evaluating-ocr-output-quality-with-character-error-rate-cer-and-word-error-rate-wer-853175297510#5aec

Jul 26 '22 11:07 Coliverfelt

Sounds good, do you want to take a stab at it?

Aug 03 '22 14:08 lvwerra