MLS Docker inference examples
Question
To provide examples of inference with the MLS pretrained tokens & lexicon, acoustic model, and language model, showing:
- Which docker image should be used
- How to pass the models and lexicon to the run command
- How to run on both CPU and GPU
Additional Context
Currently, an example command to run wav2letter inference with the latest docker image is the following:
sudo docker run --rm -v ~:/root/host/ -it --ipc=host --name w2l -a stdin -a stdout -a stderr wav2letter/wav2letter:inference-latest sh -c "cat /root/host/audio/LibriSpeech/dev-clean/777/126732/777-126732-0070.flac.wav | /root/wav2letter/build/inference/inference/examples/simple_streaming_asr_example --input_files_base_path /root/host/model"
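For reference, the same command can be broken down as follows (a sketch only; the host paths are whatever you chose when downloading the audio and models):

```bash
# Same command as above, reflowed for readability.
#   -v ~:/root/host/          mounts the host home directory (audio + models) into the container
#   --input_files_base_path   points the example binary at the directory holding the acoustic model, tokens and lexicon
sudo docker run --rm -v ~:/root/host/ -it --ipc=host --name w2l -a stdin -a stdout -a stderr \
    wav2letter/wav2letter:inference-latest \
    sh -c "cat /root/host/audio/LibriSpeech/dev-clean/777/126732/777-126732-0070.flac.wav | \
           /root/wav2letter/build/inference/inference/examples/simple_streaming_asr_example \
           --input_files_base_path /root/host/model"
```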
I have recently built a simpler docker image to run wav2vec inference here. It would be cool to have a simple pipeline for MLS/wav2letter as well!
cc @vineelpratap @xuqiantong
Hi,
To run inference, please follow the commands here - https://github.com/facebookresearch/wav2letter/tree/master/recipes/mls#decoding - using the latest docker image from the flashlight repo.
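For concreteness, here is a sketch of how the flashlight container could be started with the downloaded MLS files mounted. The image tags below are taken from the flashlight README and the `~/mls` host path is just a placeholder; adjust both to your setup:

```bash
# GPU (requires the NVIDIA container toolkit); ~/mls is a placeholder for the dir with the MLS am/lm/lexicon/tokens
sudo docker run --rm -it --ipc=host --gpus all -v ~/mls:/root/mls flashlight/flashlight:cuda-latest bash

# CPU-only variant
sudo docker run --rm -it --ipc=host -v ~/mls:/root/mls flashlight/flashlight:cpu-latest bash

# then, inside the container, run the decoding command from the MLS recipe, e.g.
/flashlight/build/bin/asr/fl_asr_decode --flagsfile=/root/mls/decode/es.cfg
```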
We provide pretrained models only for offline ASR and not for streaming ASR (https://github.com/facebookresearch/wav2letter/issues/920), hence simple_streaming_asr_example cannot be used.
@vineelpratap thanks! So once I enter the docker container, I run the commands for decoding, but I see two different syntaxes here. For beam search, we have:
/flashlight/build/bin/asr/fl_asr_decode --flagsfile=decode/[lang].cfg
While for Viterbi decoding we have:
fl_asr_test --am=[...]/am.bin --lexicon=[...]/train_lexicon.txt --datadir=[...] --test=test.lst --tokens=[...]/tokens.txt --emission_dir='' --nouselexicon --show
Why?
That's true.
fl_asr_test is for Viterbi decoding, while fl_asr_decode is for beam-search decoding with a language model. If you just care about getting the best WER, please use the latter.
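For what it's worth, the flagsfile passed to fl_asr_decode is just a plain text file with one gflags-style flag per line. A minimal sketch of what decode/[lang].cfg might look like is below; the flag names are standard flashlight ASR decoder flags, but every path and value here is a placeholder rather than the actual MLS config:

```
--am=/root/mls/es/am.bin
--tokens=/root/mls/es/tokens.txt
--lexicon=/root/mls/es/lexicon.txt
--lm=/root/mls/es/lm.bin
--datadir=/root/mls/es
--test=test.lst
--uselexicon=true
--beamsize=500
--lmweight=2.0
--wordscore=1.0
```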
Hello,
I don't know if my following question is the kind of question that is proper to ask on GitHub, but since I have been fighting with this for the last few days, I decided to ask. I am trying to learn how to train a speech recognition system in Spanish using Python, and I found out about wav2letter through the following link https://ai.facebook.com/blog/a-new-open-data-set-for-multilingual-speech-research/, which led me here: https://github.com/facebookresearch/wav2letter/tree/master/recipes/mls . I downloaded the proper files and tried to follow the USAGE STEPS in wav2letter/recipes/mls/README.md:
- I had no problem with the dataset preparation step.
- I could not complete the training or decoding steps, even after looking at the flashlight project. When README.md says there are dependencies on flashlight, I don't know what that means. I tried to download the flashlight project next to the wav2letter project, but I don't think that is the solution. I noticed that the flashlight project is written in C++. Does that mean that I need C++ to run this MLS part of wav2letter locally?
Hi, yes, once you build flashlight (https://github.com/facebookresearch/flashlight#building-and-installing), it'll build the binaries for decoding. You can then use the commands mentioned in the MLS recipe to run decoding...
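In case it helps, a rough sketch of the build steps; the CMake option names below follow the flashlight build docs, so double-check them against the version you clone:

```bash
# Sketch only: build flashlight with the ASR app so the decoding binaries are produced.
git clone https://github.com/facebookresearch/flashlight.git
cd flashlight && mkdir build && cd build
cmake .. -DFL_BACKEND=CUDA -DFL_BUILD_APP_ASR=ON   # use -DFL_BACKEND=CPU for a CPU-only build
make -j$(nproc)
# fl_asr_test and fl_asr_decode should then be under build/bin/asr/
```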
@vineelpratap is it possible to build using the provided Dockerfile here, and then use the MLS recipe to run the decoder in the same way?
Yes, that is also possible~
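For anyone following along, a hedged sketch of that path; the Dockerfile name and the mounted paths below are placeholders, so use whatever the flashlight repo actually provides:

```bash
# Sketch: build a local image from the flashlight Dockerfile, then run MLS decoding inside it.
# "Dockerfile-CUDA", ~/mls and the .cfg path are placeholders, not confirmed file names.
docker build -t flashlight-local -f Dockerfile-CUDA .
sudo docker run --rm -it --ipc=host --gpus all -v ~/mls:/root/mls flashlight-local \
    /flashlight/build/bin/asr/fl_asr_decode --flagsfile=/root/mls/decode/es.cfg
# drop --gpus all (and build from the CPU Dockerfile) for a CPU-only run
```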