whisper_normalizer

A Python package for the Whisper normalizer

Results: 7 whisper_normalizer issues

**Is your feature request related to a problem? Please describe.** - Abhijit Neil Abhraham suggested dockerizing our Python package for easier commercial use. ![WhatsApp Image 2024-02-24 at 01 21...

**Describe the bug** - GitHub Actions are failing. ![image](https://github.com/kurianbenoy/whisper_normalizer/assets/24592806/c25860fb-78e4-42e9-99b9-91d75714e1be)

- Include a cleanup module before the actual normalization
- Include non-standard spelling normalization
- How to deal with numbers: spelling out cardinals, ordinals, abbreviations, etc.

The Whisper `BasicTextNormalizer` (https://kurianbenoy.github.io/whisper_normalizer/basic.html#basictextnormalizer) seems to be widely shown to be problematic for almost all languages other than English. Check out: https://github.com/huggingface/transformers/issues/20703 In Malayalam and most Indic languages, it's...
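
To illustrate the kind of failure being described, here is a minimal sketch using only the standard library. The function `strip_marks_symbols_punct` is a hypothetical stand-in, not the package's API: it assumes a normalizer that replaces every Unicode mark, symbol, and punctuation character with a space, which is harmless for English but removes the dependent vowel signs that Malayalam words rely on.

```python
import unicodedata


def strip_marks_symbols_punct(text: str) -> str:
    """Hypothetical stand-in for an aggressive, English-centric normalizer.

    Every character whose Unicode category starts with M (mark),
    S (symbol), or P (punctuation) is replaced by a space, then
    runs of whitespace are collapsed.
    """
    normalized = unicodedata.normalize("NFKC", text)
    replaced = "".join(
        " " if unicodedata.category(ch)[0] in "MSP" else ch
        for ch in normalized
    )
    return " ".join(replaced.split())


english = "Hello, world!"
malayalam = "മലയാളം"  # the word "Malayalam" in Malayalam script

english_out = strip_marks_symbols_punct(english)
malayalam_out = strip_marks_symbols_punct(malayalam)

print(repr(english_out))    # punctuation is dropped, the words survive
print(repr(malayalam_out))  # vowel signs become spaces, splitting the word
```

Under this assumption the English sentence keeps both of its words, while the single Malayalam word loses its vowel signs and is split into fragments, which would inflate any word-level error metric computed afterwards.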

- Add detailed test cases
- Add and test how to use the normalizer in each of the Indian languages

Hi, I tried using your normalizer to help calculate WER for my personal use case. I have a ground truth like so: ``` JUNE THIRD EIGHTEEN SEVENTY ONE OBOCOCK...
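
As a rough sketch of the workflow this issue describes, the snippet below computes word error rate (WER) with and without normalization, using only the standard library. Everything here is illustrative: `simple_normalize` is a hypothetical placeholder rather than the package's normalizer, the reference uses only the first five words of the ground truth quoted above, and the hypothesis string is invented for the example.

```python
import re


def simple_normalize(text: str) -> str:
    """Hypothetical placeholder normalizer: lowercase, drop punctuation,
    collapse whitespace. A real normalizer would be swapped in here."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "JUNE THIRD EIGHTEEN SEVENTY ONE"      # from the ground truth above
hypothesis = "June third, eighteen seventy-one."   # invented model output

raw_wer = word_error_rate(reference, hypothesis)
normalized_wer = word_error_rate(simple_normalize(reference),
                                 simple_normalize(hypothesis))
print(raw_wer, normalized_wer)
```

Applying the same normalizer to both the reference and the hypothesis is the important design choice: differences in casing and punctuation then stop counting as errors, so the score reflects only the words that actually differ.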

You folks might be interested in https://github.com/google-research/nisaba, which has normalization rules for many Indic languages; they have taken an FST-based approach to normalization. This is just an...