
Support using the branch from the gigaspeech dataset for decoding

csukuangfj opened this issue 3 years ago • 2 comments

WER comparison

|          | test-clean | test-other | comment                                |
|----------|------------|------------|----------------------------------------|
| baseline | 2.78       | 7.36       | --iter 468000 --avg 16, greedy search  |
| this PR  | 3.36       | 8.20       | --iter 468000 --avg 16, greedy search  |

Decoding with the LibriSpeech branch performs better on these test sets. However, it is not clear how decoding with the GigaSpeech branch performs on other datasets.
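To make the comparison concrete, here is a minimal, purely illustrative sketch of what branch selection at decoding time looks like. The function names (`joiner_libri`, `joiner_giga`, `joint_logits`) and the `use_giga_branch` flag are hypothetical stand-ins, not the actual icefall API; the real recipe dispatches between two trained joiner modules rather than toy scoring functions.

```python
# Hypothetical sketch: a transducer model trained on LibriSpeech + GigaSpeech
# has two output branches; at decoding time a flag picks which branch's
# joiner scores the (encoder, decoder) pair. The joiners below are toy
# placeholders, NOT the real icefall joiners.

def joiner_libri(enc, dec):
    # Placeholder for the LibriSpeech-branch joiner.
    return [e + d for e, d in zip(enc, dec)]

def joiner_giga(enc, dec):
    # Placeholder for the GigaSpeech-branch joiner.
    return [e * d for e, d in zip(enc, dec)]

def joint_logits(enc, dec, use_giga_branch: bool):
    """Dispatch to the selected branch, mirroring a
    --use-giga-branch style decoding option (hypothetical name)."""
    joiner = joiner_giga if use_giga_branch else joiner_libri
    return joiner(enc, dec)
```

The point is only that the encoder is shared; swapping the flag changes which branch's scores drive greedy search, which is why the two rows in the WER table differ.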

The decoding results are listed below

|          | test-clean | test-other |
|----------|------------|------------|
| baseline | errs-test-clean-greedy_search-iter-468000-avg-16-context-2-max-sym-per-frame-1-use-averaged-model.txt | errs-test-other-greedy_search-iter-468000-avg-16-context-2-max-sym-per-frame-1-use-averaged-model.txt |
| this PR  | errs-test-clean-greedy_search-iter-468000-avg-16-use-giga-branch-context-2-max-sym-per-frame-1-use-averaged-model.txt | errs-test-other-greedy_search-iter-468000-avg-16-use-giga-branch-context-2-max-sym-per-frame-1-use-averaged-model.txt |

— csukuangfj, Oct 05 '22

The pretrained models are uploaded to https://huggingface.co/csukuangfj/icefall-asr-librispeech-lstm-transducer-stateless2-2022-09-03/tree/main/exp/giga

— csukuangfj, Oct 05 '22

My wife was saying that when she says things like "google", the LibriSpeech model never gets it right in its output. We'll see whether the GigaSpeech one is better.

— danpovey, Oct 05 '22