GT-level specific metrics
Originally posted by @bertsky in https://github.com/OCR-D/spec/pull/225#discussion_r1086173671
Speaking of: IMHO it would be quite relevant to offer a CER metric under level-2 (or even level-1) equivalency. Not exclusively (because this is not standard), but as a complementary variant.
This could be done either by normalising both sides to OCR-D GT level 2 (or 1), or by passing equivalence classes (zero edit cost rules) to the distance metric.
For example, naively, an umlaut error (`u` instead of `ü` or `uͤ`), or a punctuation error (`"` instead of `“`), will have the same cost as any other error. But they might not be as relevant as others.
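To make the idea concrete, here is a minimal sketch of the normalisation variant: both sides are mapped through an equivalence table before the edit distance is computed, and the strict CER stays available as the default. The names `EQUIVALENCES`, `normalize` and `cer` are hypothetical, and the table is only an illustrative stand-in, not the actual OCR-D GT level 1 or 2 rules.

```python
import unicodedata

# Hypothetical equivalence table for illustration only.
# This is NOT the actual OCR-D GT level 1/2 rule set.
EQUIVALENCES = {
    "u\u0364": "\u00fc",  # u + combining small e (uͤ) -> ü
    "\u201c": '"',        # left double quotation mark -> ASCII quote
    "\u201d": '"',        # right double quotation mark -> ASCII quote
    "\u201e": '"',        # low double quotation mark -> ASCII quote
}


def normalize(text):
    """Apply NFC, then collapse each equivalence class to one representative."""
    text = unicodedata.normalize("NFC", text)
    for source, target in EQUIVALENCES.items():
        text = text.replace(source, target)
    return text


def levenshtein(a, b):
    """Plain Levenshtein distance over code points, unit cost per edit."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(gt, ocr, normalized=False):
    """Character error rate relative to the (optionally normalised) GT length."""
    if normalized:
        gt, ocr = normalize(gt), normalize(ocr)
    if not gt:
        raise ValueError("ground truth must not be empty")
    return levenshtein(gt, ocr) / len(gt)


gt = "Fu\u0364r"    # GT transcribes the umlaut as u + combining e
ocr = "F\u00fcr"    # OCR produced the precomposed ü
strict = cer(gt, ocr)
relaxed = cer(gt, ocr, normalized=True)
```

With this table, the strict CER of the example pair is 0.5 (two edits over four code points), while the normalised variant is 0.0. The second variant from above, zero-cost rules inside the distance metric, would instead change the substitution cost in `levenshtein`; multi-code-point classes such as `uͤ` would then need alignment over grapheme clusters rather than code points.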