
decoded text not similar

Open laishramrahul opened this issue 1 year ago • 3 comments

I have built models based on conformer-ctc librispeech. I am comparing the decoded text of the test set against the output of sherpa offline_ctc_asr. The decoded text is not exactly the same for the same file. I want to get exactly the same decoded text; please help.

laishramrahul avatar Jun 16 '23 18:06 laishramrahul

Do you use the same decoding method? Does this happen for all files (i.e. the WERs of a bunch of files are worse) or just for one wav?

pkufool avatar Jun 17 '23 04:06 pkufool

The current model was trained for epochs 0-19.

The test files are decoded using "./conformer_ctc/decode.py --epoch 19 --avg 1 --exp-dir conformer_ctc/exp".

The epoch-19 model is exported with "python conformer_ctc/export.py --epoch 19 --avg 1 --exp-dir conformer_ctc/exp --lang-dir data/lang_bpe_500 --jit 1" to be used with sherpa.

The exported model is used with "./sherpa/bin/offline_ctc_asr.py --nn-model conformer_ctc/exp/cpu_jit.pt --tokens data/lang_bpe_500/tokens.txt --use-gpu false --HLG data/lang_bpe_500/HLG.pt --lm-scale 5.0 audio_files/1000000194.wav". I have tried different values of --lm-scale on a few different files, but the decoded text given by decode.py and offline_ctc_asr.py is not the same.

laishramrahul avatar Jun 21 '23 11:06 laishramrahul

Could you post the decoding logs of ./conformer_ctc/decode.py --epoch 19 --avg 1 --exp-dir conformer_ctc/exp and ./sherpa/bin/offline_ctc_asr.py --nn-model conformer_ctc/exp/cpu_jit.pt --tokens data/lang_bpe_500/tokens.txt --use-gpu false --HLG data/lang_bpe_500/HLG.pt --lm-scale 5.0 audio_files/1000000194.wav so we can compare the decoding configurations?

pkufool avatar Jun 22 '23 08:06 pkufool
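
While the logs are being collected, it may help to quantify how far apart the two transcripts are for each file, which speaks to the earlier question of whether the mismatch affects all files or just one wav. The following is only a rough sketch using the Python standard library; the function names `word_diff` and `word_edit_distance` are hypothetical helpers written for this illustration and are not part of icefall or sherpa, and the transcripts are assumed to have already been copied out of each tool's output as plain strings.

```python
from difflib import SequenceMatcher


def word_diff(ref_text, hyp_text):
    """List the word spans where two transcripts disagree.

    Hypothetical helper for illustration only. Returns tuples of
    (operation, words_from_ref, words_from_hyp) for non-matching spans.
    """
    ref = ref_text.split()
    hyp = hyp_text.split()
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return [
        (tag, ref[i1:i2], hyp[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def word_edit_distance(ref_text, hyp_text):
    """Word-level Levenshtein distance between two transcripts."""
    ref = ref_text.split()
    hyp = hyp_text.split()
    # prev[j] is the distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(
                min(
                    prev[j] + 1,  # word missing from hyp
                    cur[j - 1] + 1,  # extra word in hyp
                    prev[j - 1] + (ref_word != hyp_word),  # match or substitution
                )
            )
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # Placeholder strings, not real decoder output.
    from_decode_py = "the cat sat on the mat"
    from_sherpa = "the cat sat on a mat"
    print(word_edit_distance(from_decode_py, from_sherpa))
    for op in word_diff(from_decode_py, from_sherpa):
        print(op)
```

If the distance is small and scattered across many files, a difference in decoding configuration is a plausible cause; if it is large for a single file only, the audio or feature extraction for that file is worth checking first.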