In the wenetspeech recipe, fast_beam_search_LG almost always gives a worse WER than greedy search!

Open zhangzhengyireal opened this issue 2 years ago • 2 comments

Collecting environment information...
k2 version: 1.24.3
Build type: Release
Git SHA1: 42e92fdd4097adcfe9937b4d2df7736d227b8e85
Git date: Wed Jun 28 09:50:36 2023
Cuda used to build k2: 11.6
cuDNN used to build k2: 8.2.0
Python version used to build k2: 3.9
OS used to build k2: Ubuntu 20.04.6 LTS
CMake version: 3.26.4
GCC version: 7.5.0
PyTorch version used to build k2: 1.13.1+cu116
PyTorch is using Cuda: 11.6
NVTX enabled: True
With CUDA: True
Disable debug: True
Sync kernels : False
Disable checks: False
Max cpu memory allocate: 214748364800 bytes (or 200.0 GB)
k2 abort: False

Resource: https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615
Testset: wenetspeech/ DEV

Bash command:

exp_dir=download/huggingface/icefall-asr-zipformer-streaming-wenetspeech-20230615/exp
lang_dir=download/huggingface/icefall-asr-zipformer-streaming-wenetspeech-20230615/data/lang_char
decode_method=greedy_search
#decode_method=fast_beam_search_LG
./zipformer/decode.py \
  --epoch ${ep} \
  --avg ${avg} \
  --exp-dir ${exp_dir}/ \
  --lang-dir ${lang_dir} \
  --max-duration 800 \
  --decoding-method ${decode_method} \
  --blank-penalty ${blank_penalty} \
  --ngram-lm-scale ${nls} \
  --ilme-scale ${ilme_scale} \
  --manifest-dir data/fbank/ \
  --causal 1 \
  --chunk-size ${chunk_size} \
  --left-context-frames ${left_context}

Result: [image: FXwkpuzV47]

With both chunk=16 and chunk=32, fast_beam_search_LG does not give me a better WER than greedy search.

zhangzhengyireal (Aug 11 '23 02:08)

Have you tried the LODR method? Also, assuming your LG is based on Chinese words, what is the vocabulary coverage of your dev set like?
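One way to check the coverage question is to count how many word tokens in the dev transcripts appear in the word list behind the LG graph. The following is a minimal sketch, not part of icefall: the function name `vocab_coverage` and the toy data are hypothetical, and it assumes the dev transcripts have already been segmented into whitespace-separated words with the same segmentation used to build the lexicon.

```python
from collections import Counter


def vocab_coverage(lexicon_words, transcripts):
    """Return (token_coverage, oov_counter) for word-segmented transcripts.

    lexicon_words: iterable of words in the lexicon (hypothetical input).
    transcripts: iterable of strings with whitespace-separated words.
    """
    vocab = set(lexicon_words)
    total = 0
    oov = Counter()
    for line in transcripts:
        for word in line.split():
            total += 1
            if word not in vocab:
                oov[word] += 1
    covered = total - sum(oov.values())
    # Report 0.0 for empty input instead of dividing by zero.
    coverage = covered / total if total else 0.0
    return coverage, oov


if __name__ == "__main__":
    # Toy data for illustration only; not taken from WenetSpeech.
    lexicon = ["今天", "天气", "很", "好"]
    dev = ["今天 天气 很 好", "今天 心情 很 好"]
    coverage, oov = vocab_coverage(lexicon, dev)
    print(f"token coverage: {coverage:.3f}")
    print("most frequent OOV words:", oov.most_common(5))
```

A low token coverage, or frequent words in the OOV list, would suggest that the word-level LG cannot produce parts of the reference, which greedy search over characters is not constrained by.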

danpovey (Aug 11 '23 02:08)

In my experiments, I have always found the "nbest" variants to be better than the one-best versions, e.g., fast_beam_search_nbest_LG is better than fast_beam_search_LG.

Usually, you would also need to play around with the --beam parameter to balance insertions against deletions. It looks like you have significantly more deletions than insertions at the moment, so you could try increasing the beam.
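To make the insertion/deletion balance concrete, here is a small sketch of how the three error types can be counted from a minimum edit-distance alignment. It is an illustration only, not icefall's scoring code: the function name `error_counts` is hypothetical, the example strings are made up, and it assumes character-level scoring.

```python
def error_counts(ref, hyp):
    """Return (substitutions, insertions, deletions) for hyp against ref.

    ref and hyp are sequences of tokens (here: characters).
    """
    n, m = len(ref), len(hyp)
    # dp[i][j] = (total_errors, subs, ins, dels) for ref[:i] vs hyp[:j]
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, 0, i)  # empty hypothesis: everything is deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, j, 0)  # empty reference: everything is inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, s, a, d = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, s, a, d)  # match, no new error
            else:
                diag = (e + 1, s + 1, a, d)  # substitution
            e, s, a, d = dp[i - 1][j]
            deletion = (e + 1, s, a, d + 1)
            e, s, a, d = dp[i][j - 1]
            insertion = (e + 1, s, a + 1, d)
            # Tuples compare by total errors first, so this keeps a
            # minimum edit-distance alignment; ties break deterministically.
            dp[i][j] = min(diag, deletion, insertion)
    _, subs, ins, dels = dp[n][m]
    return subs, ins, dels


if __name__ == "__main__":
    # Made-up example: the hypothesis drops two reference characters.
    ref = list("今天天气很好")
    hyp = list("今天气好")
    subs, ins, dels = error_counts(ref, hyp)
    print(f"sub={subs} ins={ins} del={dels}")
```

When deletions dominate in counts like these, the decoder is dropping tokens, which is the situation where a wider beam is worth trying.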

desh2608 (Aug 25 '23 14:08)