fairseq
AssertionError: lexicon free decoding can only be done with a unit language model
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
When I evaluate a wav2vec 2.0 CTC model according to fairseq/examples/wav2vec/README.md, I encounter the following error:
AssertionError: lexicon free decoding can only be done with a unit language model
Code
Here is the code I'm executing:
subset=test_clean
CUDA_VISIBLE_DEVICES=1 python /home/quhongling/fairseq-main/examples/speech_recognition/infer.py \
/Data/QuHonglin/datasets/wav2vec2/Librispeech/evaluate/100h \
--task audio_finetuning \
--nbest 1 --path /Data/QuHonglin/pre-trained-models/wav2vec_small_100h.pt \
--gen-subset $subset --results-path /Data/QuHonglin/experiments/wav2vec2/Librispeech/evaluate/100h/4-gram-lm/test_clean \
--w2l-decoder kenlm --lm-model /Data/QuHonglin/pre-trained-models/lm_librispeech_kenlm_word_4g_200kvocab.bin \
--lm-weight 2 --word-score -1 --sil-weight 0 --criterion ctc --labels ltr --max-tokens 4000000 \
--post-process letter
And here is the error log:
Traceback (most recent call last):
File "/home/quhongling/fairseq-main/examples/speech_recognition/infer.py", line 436, in <module>
cli_main()
File "/home/quhongling/fairseq-main/examples/speech_recognition/infer.py", line 432, in cli_main
main(args)
File "/home/quhongling/fairseq-main/examples/speech_recognition/infer.py", line 290, in main
generator = build_generator(args)
File "/home/quhongling/fairseq-main/examples/speech_recognition/infer.py", line 279, in build_generator
return W2lKenLMDecoder(args, task.target_dictionary)
File "/home/quhongling/fairseq-main/examples/speech_recognition/w2l_decoder.py", line 179, in __init__
assert args.unit_lm, "lexicon free decoding can only be done with a unit language model"
AssertionError: lexicon free decoding can only be done with a unit language model
What have you tried?
When I try --w2l-decoder viterbi, it works fine.
When I try adding --unit-lm, or --unit-lm --kenlm-model=/Data/QuHonglin/pre-trained-models/lm_librispeech_kenlm_word_4g_200kvocab.bin, it runs, but the hypotheses are all empty, resulting in a WER of 100%.
So how do I correctly use a language model to decode the wav2vec 2.0 CTC model?
What's your environment?
- fairseq Version: main
- PyTorch Version: 1.8.1+cu101
- OS: Ubuntu 18.04 (Linux)
- How you installed fairseq: source
- Build command you used (if compiling from source):
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
- Python version: 3.7
- CUDA/cuDNN version: cuda10.1/cudnn-cuda10.1-8.0.5
- GPU models and configuration:
- Any other relevant information:
You need to specify your lexicon file in the command using --lexicon lexicon.txt; in your case:
subset=test_clean
CUDA_VISIBLE_DEVICES=1 python /home/quhongling/fairseq-main/examples/speech_recognition/infer.py \
/Data/QuHonglin/datasets/wav2vec2/Librispeech/evaluate/100h \
--task audio_finetuning \
--nbest 1 --path /Data/QuHonglin/pre-trained-models/wav2vec_small_100h.pt \
--gen-subset $subset --results-path /Data/QuHonglin/experiments/wav2vec2/Librispeech/evaluate/100h/4-gram-lm/test_clean \
--w2l-decoder kenlm --lm-model /Data/QuHonglin/pre-trained-models/lm_librispeech_kenlm_word_4g_200kvocab.bin \
--lm-weight 2 --word-score -1 --sil-weight 0 --criterion ctc --labels ltr --max-tokens 4000000 \
--post-process letter --lexicon lexicon.txt
Use the same lexicon.txt that was used with the KenLM model. It should look like this:
EVERY E V E R Y |
WORD W O R D |
THAT T H A T |
EXISTS E X I S T S |
IN I N |
YOUR Y O U R |
LABEL L A B E L |
OR O R |
TRANSCRIPTION T R A N S C R I P T I O N |
FILE F I L E |
WILL W I L L |
WRITE W R I T E |
DOWN D O W N |
LIKE L I K E |
THIS T H I S |
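Entries in this format can also be generated programmatically. A minimal sketch, assuming a word list gathered from the fine-tuning transcriptions (the word list here is purely illustrative):

```python
# Sketch: generate lexicon entries in the format shown above.
# Each line maps a word to its letter spelling followed by the "|"
# word-boundary token used by fairseq's letter (ltr) labels.

def lexicon_entry(word: str) -> str:
    letters = " ".join(word.upper())
    return f"{word.upper()} {letters} |"

# Illustrative word list; in practice, collect every word that appears
# in your label/transcription files.
words = ["every", "word", "that", "exists"]
for w in sorted(set(words)):
    print(lexicon_entry(w))
```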
@Abdullah955 Thanks for your reply. But if I want lexicon-free decoding, what should I do?
@quhonglin I think you need a unit LM, i.e. a language model built with characters as its units.
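In practice that means the n-gram LM is trained on text that has been split into the same character units as the model's labels. A rough sketch of that conversion (the normalization here is an illustrative assumption, not fairseq's own tooling):

```python
# Sketch: turn word-level text into character ("unit") tokens, with "|"
# marking word boundaries, matching fairseq's letter (ltr) labels.
# An n-gram LM trained on such lines would be a "unit language model".

def to_char_units(line: str) -> str:
    words = line.strip().upper().split()
    return " ".join(" ".join(w) + " |" for w in words)

print(to_char_units("hello world"))
# e.g. "H E L L O | W O R L D |"
```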
@quhonglin you need to create your own language model, or use a pre-trained one, with KenLM. This tutorial should help you:
https://huggingface.co/blog/wav2vec2-with-ngram
You only need text to train your model.
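Before training, the text corpus usually needs light normalization so the LM vocabulary matches the decoder's word labels. A minimal sketch (the normalization rules and file names are illustrative assumptions); once corpus.txt is written, a 4-gram KenLM can be trained with e.g. `lmplz -o 4 < corpus.txt > 4gram.arpa`:

```python
# Sketch: normalize raw text for n-gram LM training, so the LM's vocabulary
# uses the same uppercase word forms as the lexicon and labels.

import re

def normalize(line: str) -> str:
    # Uppercase, then keep only letters, apostrophes and spaces;
    # collapse any remaining whitespace runs into single spaces.
    line = line.upper()
    line = re.sub(r"[^A-Z' ]+", " ", line)
    return " ".join(line.split())

print(normalize("Hello, world!"))
# e.g. "HELLO WORLD"
```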
Thanks, everyone. I no longer use this setup and have also forgotten some details of this issue. Maybe I'll try again when I have time in the future.
@Abdullah955 I followed the steps above, and my lexicon format is just as described. But when execution reaches "self.lm = KenLM(cfg.lmpath, self.word_dict)",
a segmentation fault (core dumped) occurs.
Can you help me with this? Thanks.
The model is the data2vec base model.