
Cannot reproduce the RESULTS (WER: 43.25%) of S5b_track1 in CHiME-6

Open dospeech opened this issue 3 years ago • 10 comments

When I run the run.sh baseline of S5b_track1 in CHiME-6, I can't get the 43.25% WER from RESULTS; the best result I get is a WER of 45.48%. wer_details: %WER 45.48 [ 26779 / 58881, 2332 ins, 11138 del, 13309 sub ] exp/chain_train_worn_simu_u400k_cleaned_rvb/tdnn1b_cnn_sp/decode_dev_gss_multiarray_2stage/wer_9_0.0

Is there something wrong with my result?
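(Editorial aside: the reported counts are internally consistent. A quick sanity check, using the standard WER formula that Kaldi's scoring applies to the numbers in the `wer_details` line above:)

```python
# Recompute the WER from the error counts in the wer_details line above.
def wer(ins, dels, subs, ref_words):
    """WER = (insertions + deletions + substitutions) / reference words, in %."""
    return 100.0 * (ins + dels + subs) / ref_words

# 2332 ins + 11138 del + 13309 sub = 26779 errors over 58881 reference words
print(f"{wer(2332, 11138, 13309, 58881):.2f}")  # 45.48
```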

dospeech avatar Feb 22 '22 09:02 dospeech

@desh2608 might be able to help with this.

danpovey avatar Feb 22 '22 09:02 danpovey

It's hard to say what may be going wrong just based on WER. Did you change any hyperparameters? Did you look at the intermediate results (e.g. from GMM decoding) to see if they match those in the recipe?

desh2608 avatar Feb 22 '22 19:02 desh2608

> It's hard to say what may be going wrong just based on WER. Did you change any hyperparameters? Did you look at the intermediate results (e.g. from GMM decoding) to see if they match those in the recipe?

The WER of GMM decoding (tri3) is 83.75% (baseline is 85.72%). I only changed the number of GPUs (gpu_num=4) and didn't change any hyperparameters.

Was the baseline result (WER: 43.25%) generated using RNNLM rescoring?

dospeech avatar Feb 23 '22 13:02 dospeech

Just a note -- it might simply be a difference caused by different random initialization. y.


jtrmal avatar Feb 23 '22 13:02 jtrmal

I think so, yeah. The WER without RNNLM rescoring should be closer to 46%. See the note at the top of run_cnn_tdnn_1b.sh: %WER 46.07 [ 27124 / 58881, 2905 ins, 9682 del, 14537 sub ] exp/chain_train_worn_simu_u400k_cleaned_rvb/tdnn1b_cnn_sp/decode_dev_gss_multiarray_2stage/wer_10_0.0

Could you try with the RNNLM rescoring?
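(Editorial aside: the `wer_10_0.0` suffix above follows Kaldi's decode-directory convention, `wer_<lm-weight>_<word-insertion-penalty>`, where each file holds one `%WER` line and `utils/best_wer.sh` picks the minimum. A minimal Python equivalent of that selection, with the two `%WER` lines from this thread as stand-in file contents:)

```python
import re

# Stand-in contents for wer_* files in a Kaldi decode dir; each file holds
# one "%WER ..." line for a (lm-weight, insertion-penalty) combination.
wer_files = {
    "wer_9_0.0":  "%WER 45.48 [ 26779 / 58881, 2332 ins, 11138 del, 13309 sub ]",
    "wer_10_0.0": "%WER 46.07 [ 27124 / 58881, 2905 ins, 9682 del, 14537 sub ]",
}

def best_wer(files):
    """Pick the file with the lowest WER, as utils/best_wer.sh does."""
    return min(files.items(),
               key=lambda kv: float(re.search(r"%WER (\d+\.\d+)", kv[1]).group(1)))

name, line = best_wer(wer_files)
print(name, line.split()[1])  # wer_9_0.0 45.48
```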

desh2608 avatar Feb 23 '22 13:02 desh2608

> I think so, yeah. The WER without RNNLM rescoring should be closer to 46%. See the note at the top of run_cnn_tdnn_1b.sh: %WER 46.07 [ 27124 / 58881, 2905 ins, 9682 del, 14537 sub ] exp/chain_train_worn_simu_u400k_cleaned_rvb/tdnn1b_cnn_sp/decode_dev_gss_multiarray_2stage/wer_10_0.0
>
> Could you try with the RNNLM rescoring?

I've tried RNNLM rescoring, and the WER is 43.72%, which looks closer to the 43.25% baseline. Are there any other optimizations?
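(Editorial aside: one very rough way to gauge the remaining 0.47% absolute gap is to treat each of the 58,881 reference words as an independent Bernoulli trial. Word errors actually cluster within utterances, so this underestimates the true variance; Kaldi ships a proper bootstrap tool, `compute-wer-bootci`, for matched-pairs comparisons.)

```python
import math

# Naive binomial std. error of a WER difference. This assumes independent
# word errors, which is optimistic -- errors are correlated within
# utterances, so the real uncertainty is larger than computed here.
N = 58881                         # reference word count from the thread
p1, p2 = 0.4372, 0.4325           # reproduced WER vs. reported baseline WER
p = (p1 + p2) / 2
se_diff = math.sqrt(2 * p * (1 - p) / N)
print(f"gap = {abs(p1 - p2):.4f}, ~{abs(p1 - p2) / se_diff:.1f} naive std. errors")
```

Even under this optimistic independence assumption the gap is under two standard errors, which is consistent with the "statistically insignificant" assessment below.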

dospeech avatar Feb 25 '22 07:02 dospeech

You can try tuning some of the hyperparameters (esp. learning rate) since you changed the number of training jobs (GPUs). But I think at this point you're close enough that the difference is statistically insignificant.
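(Editorial aside: the reason changing the number of jobs interacts with the learning rate is Kaldi's nnet3 convention, as I understand it from the `steps/nnet3/train_*.py` scripts: the user specifies an "effective" learning rate, and each of the parallel jobs trains with that rate multiplied by the job count, since the per-job models are averaged after every iteration. A sketch with a hypothetical effective rate of 0.001:)

```python
# Kaldi nnet3 convention (hedged): per-job learning rate is the configured
# effective rate (e.g. --trainer.optimization.initial-effective-lrate)
# scaled by the number of parallel jobs, to compensate for model averaging.
def job_learning_rate(effective_lr, num_jobs):
    return effective_lr * num_jobs

effective_lr = 0.001                       # hypothetical value for illustration
for n in (2, 4):                           # e.g. a recipe default vs. gpu_num=4
    print(n, job_learning_rate(effective_lr, n))
```

So changing gpu_num=4 changes the actual per-job rate and the averaging dynamics even when no hyperparameter flag is touched, which is why small WER shifts are expected.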

desh2608 avatar Feb 25 '22 14:02 desh2608

> You can try tuning some of the hyperparameters (esp. learning rate) since you changed the number of training jobs (GPUs). But I think at this point you're close enough that the difference is statistically insignificant.

OK, thank you for your answer. Also, is there a baseline WER for the evaluation set?

dospeech avatar Feb 25 '22 14:02 dospeech

Sorry, I don't think I have the eval numbers for that exact recipe on hand. We tried several systems during the challenge (see Table 7 in https://arxiv.org/pdf/2006.07898.pdf) and it seems the eval WER tracks dev. It would be great if you could make a PR with the number if you run it.

desh2608 avatar Feb 25 '22 16:02 desh2608

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

stale[bot] avatar Apr 27 '22 10:04 stale[bot]