fairseq

Poor performance in Chinese

Open zhhl9101 opened this issue 2 years ago • 12 comments

❓ Questions and Help

Hello MMS,

I ran a test on cmn-script_simplified with the MMS-1B-all model and got a 44% WER, which is unacceptably high.

Audio: NCYzUhAtZNI_0066.zip

What I want: "而且显得皮肤好白哟就这支颜色会显得你夏天的时候特别的有气质而且会很亮眼就是人群当中第一眼就会看到你"

What I got: "而些显的皮复好摆奥这这纪人丝会显得你下天的收特别的有气质而且会很量演人取当中地影对看到你"
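The gap between the two strings above can be quantified with a character-level error rate, which is a common choice for unsegmented Chinese text. The sketch below is a plain edit-distance implementation in Python; the function names are illustrative, and it is not necessarily how the 44% figure above was measured.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Reference and hypothesis copied from this issue.
reference = "而且显得皮肤好白哟就这支颜色会显得你夏天的时候特别的有气质而且会很亮眼就是人群当中第一眼就会看到你"
hypothesis = "而些显的皮复好摆奥这这纪人丝会显得你下天的收特别的有气质而且会很量演人取当中地影对看到你"

print(f"CER: {char_error_rate(reference, hypothesis):.2%}")
```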

What changes do I need to make to improve the result?

Thanks!

zhhl9101, Aug 28 '23 10:08

Hi, I would recommend running the decoding with a language model to get better accuracy.

vineelpratap, Sep 06 '23 21:09

Thanks, but I could not find a Chinese LM in https://huggingface.co/facebook/mms-cclms/tree/main/lms. Does this mean I need to train the LM myself?

zhhl9101, Sep 07 '23 01:09

Looks like HuggingFace has a 50GB limit on the models. I'll upload the model on S3 and share the link here soon.

vineelpratap, Sep 07 '23 18:09

Great to hear, thanks a lot!

If convenient, could you share an example of how to use the LM? I would like to transcribe multiple audio files in a for loop while loading the model only once, rather than reloading it for every file, which is time-consuming.
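The load-once pattern could look like the Python sketch below. It is only an illustration under stated assumptions: it uses the Hugging Face Transformers port of MMS rather than the fairseq scripts, it does greedy CTC decoding without a language model, it assumes `torch`, `transformers` and `librosa` are installed, and the helper names and file names are hypothetical.

```python
from typing import Callable, Dict, Iterable


def build_mms_transcriber(model_id: str = "facebook/mms-1b-all",
                          lang: str = "cmn-script_simplified") -> Callable[[str], str]:
    """Load the model once and return a function that transcribes one file."""
    # Heavy imports are kept local so the rest of the module stays importable
    # on machines without these packages.
    import librosa
    import torch
    from transformers import AutoProcessor, Wav2Vec2ForCTC

    processor = AutoProcessor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    # Switch the tokenizer vocabulary and the adapter weights to the target language.
    processor.tokenizer.set_target_lang(lang)
    model.load_adapter(lang)
    model.eval()

    def transcribe(path: str) -> str:
        audio, _ = librosa.load(path, sr=16_000)  # MMS expects 16 kHz mono audio
        inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        ids = torch.argmax(logits, dim=-1)[0]  # greedy decoding, no LM
        return processor.decode(ids)

    return transcribe


def transcribe_files(paths: Iterable[str],
                     transcribe: Callable[[str], str]) -> Dict[str, str]:
    """Run an already-loaded transcriber over many files."""
    return {path: transcribe(path) for path in paths}


# Usage (downloads the model on first run; file names are hypothetical):
# transcribe = build_mms_transcriber()
# results = transcribe_files(["a.wav", "b.wav"], transcribe)
```

Because the model is captured in the returned closure, the loading cost is paid once regardless of how many files are processed.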

zhhl9101, Sep 07 '23 19:09

Please see the instructions here on how to download the models and run them: https://huggingface.co/facebook/mms-cclms

vineelpratap, Sep 08 '23 19:09

I got an error when downloading the cmn LM file: image

zhhl9101, Sep 09 '23 06:09

image

zhhl9101, Sep 14 '23 06:09

@vineelpratap Hello, do you have any idea about the error above: "This model has order 20 but KenLM was compiled to support up to 6."? Thanks.

zhhl9101, Sep 19 '23 09:09

Hi, you would have to rebuild KenLM on your machine to support order 20.

vineelpratap, Sep 19 '23 19:09

Thanks, could you share a guide on how to rebuild KenLM?

zhhl9101, Sep 20 '23 10:09

Hello @vineelpratap, could you kindly share the rebuilt LM directly, or explain how to rebuild KenLM? For model users, rebuilding is not convenient, and it is difficult for me to do. Thank you once again.

zhhl9101, Oct 08 '23 02:10
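
For readers who hit the same "order 20" error, the rebuild could be sketched as follows. KenLM's CMake build exposes a `KENLM_MAX_ORDER` option (default 6); the Python helper below only assembles and runs the corresponding CMake commands. It assumes a local checkout of the KenLM sources, CMake 3.13 or newer, and KenLM's build dependencies (such as Boost) already installed; the directory name and helper names are hypothetical. Any Python binding or decoder that links against KenLM would presumably need to be rebuilt against the new build as well.

```python
import subprocess
from pathlib import Path
from typing import List


def kenlm_build_commands(source_dir: str, max_order: int = 20,
                         jobs: int = 4) -> List[List[str]]:
    """Return the CMake configure and build commands for a given max n-gram order."""
    build_dir = Path(source_dir) / "build"
    return [
        # Configure: raise the compile-time limit on the n-gram order.
        ["cmake", "-S", str(source_dir), "-B", str(build_dir),
         f"-DKENLM_MAX_ORDER={max_order}"],
        # Build with a few parallel jobs.
        ["cmake", "--build", str(build_dir), "--parallel", str(jobs)],
    ]


def rebuild_kenlm(source_dir: str, max_order: int = 20) -> None:
    """Run the commands, stopping at the first failure."""
    for cmd in kenlm_build_commands(source_dir, max_order):
        subprocess.run(cmd, check=True)


# Usage (assumes the KenLM sources were cloned into ./kenlm):
# rebuild_kenlm("kenlm", max_order=20)
```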