Flashlight and Pyctcdecode decoders
Preserve Flashlight and Pyctcdecode beamsearch with Ngram LM
Support Flashlight and Pyctcdecode decoding with pure KenLM and NeMo KenLM. Standardize the API of the CLI inference scripts.
Collection: ASR
Changelog
- Fix install script install_beamsearch_decoders.sh
- Create the flashlight_lexicon file when running scripts/asr_language_modeling/ngram_lm/train_kenlm.py and tar it together with kenlm.bin
- Unify parameters for eval_beamsearch_ngram_ctc.py, speech_to_text_eval.py and training
  - Get logprobs from Hypothesis
  - Use the "pyctcdecode" strategy as the default beamsearch algorithm, denoted as "beam"
  - Remove the default seq2seq strategy
  - Check decoding_type and search_type combinations
  - Support an empty string in nemo_kenlm_path and word_kenlm_path for beamsearch without LM (ZeroLM)
- Fix bug with EncDecHybridRNNTCTCModel in examples/asr/transcribe_speech.py
- Support AggregateTokenizer in scripts/asr_language_modeling/ngram_lm/create_lexicon_from_arpa.py
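To illustrate the flashlight_lexicon item above, here is a minimal sketch of writing a lexicon file. It assumes the common Flashlight layout of one word per line, followed by a tab and the space-separated tokens that spell it. The function name `write_flashlight_lexicon` and the character-level `char_tokenize` are hypothetical stand-ins, not the code in train_kenlm.py; the real script would use the acoustic model's tokenizer.

```python
from typing import Callable, Iterable, List


def write_flashlight_lexicon(
    words: Iterable[str],
    tokenize: Callable[[str], List[str]],
    path: str,
) -> int:
    """Write one `word<TAB>tok1 tok2 ...` line per unique word.

    Returns the number of lines written. Empty words and repeats are skipped.
    """
    seen = set()
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            if not word or word in seen:
                continue
            seen.add(word)
            tokens = tokenize(word)
            f.write(f"{word}\t{' '.join(tokens)}\n")
            count += 1
    return count


def char_tokenize(word: str) -> List[str]:
    """Hypothetical tokenizer: split a word into characters."""
    return list(word)
```

Calling `write_flashlight_lexicon(["hello", "hi"], char_tokenize, "am_model.flashlight_lexicon")` would produce two lines, `hello<TAB>h e l l o` and `hi<TAB>h i`.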
python3 scripts/asr_language_modeling/ngram_lm/eval_beamsearch_ngram_ctc.py \
model_path=am_model.nemo \
dataset_manifest=manifest.json \
preds_output_folder=/tmp \
ctc_decoding.strategy=flashlight \
ctc_decoding.beam.kenlm_path=am_model.kenlm \
ctc_decoding.beam.beam_size=[4] \
ctc_decoding.beam.beam_alpha=[0.5] \
ctc_decoding.beam.beam_beta=[0.5] \
batch_size=32 \
beam_batch_size=1 \
cuda=1
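The bracketed values for beam_size, beam_alpha and beam_beta in the command above suggest that each parameter accepts a list of candidates to sweep. A minimal sketch of how such lists could expand into individual decoding configurations, assuming a full Cartesian product (the helper `expand_grid` is hypothetical and not taken from the script):

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_grid(
    beam_size: Sequence[int],
    beam_alpha: Sequence[float],
    beam_beta: Sequence[float],
) -> List[Dict[str, float]]:
    """Return one config dict per (beam_size, beam_alpha, beam_beta) combination."""
    return [
        {"beam_size": size, "beam_alpha": alpha, "beam_beta": beta}
        for size, alpha, beta in product(beam_size, beam_alpha, beam_beta)
    ]


# The single-valued lists from the command yield exactly one configuration.
grid = expand_grid([4], [0.5], [0.5])
```

Under this assumption, passing two beam sizes and two alpha values with one beta would yield four decoding runs.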
# Note for flashlight_cfg.lexicon_path: DEFAULT_TOKEN_OFFSET
python3 examples/asr/speech_to_text_eval.py \
model_path=am_model.nemo \
dataset_manifest=manifest.json \
decoder_type=ctc \
ctc_decoding.strategy=flashlight \
ctc_decoding.beam.nemo_kenlm_path=kenlm_model.bin \
ctc_decoding.beam.beam_size=4 \
ctc_decoding.beam.beam_alpha=0.5 \
ctc_decoding.beam.beam_beta=0.5 \
ctc_decoding.beam.flashlight_cfg.lexicon_path=am_model.flashlight_lexicon \
ctc_decoding.beam.return_best_hypothesis=true \
batch_size=32 \
output_filename=/tmp/manifest_out.json \
cuda=1
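To sanity-check the manifest written to output_filename, a word error rate can be recomputed offline. The sketch below assumes the output is a JSON-lines file in which each entry holds the reference under `text` and the hypothesis under `pred_text`; both key names are assumptions here, and the helpers `edit_distance` and `manifest_wer` are illustrative rather than part of the scripts.

```python
import json
from typing import Iterable, List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1]


def manifest_wer(
    lines: Iterable[str], ref_key: str = "text", hyp_key: str = "pred_text"
) -> float:
    """Corpus-level WER: total word errors divided by total reference words."""
    errors = 0
    words = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        ref = entry[ref_key].split()
        hyp = entry[hyp_key].split()
        errors += edit_distance(ref, hyp)
        words += len(ref)
    return errors / words if words else 0.0
```

A typical use would be `manifest_wer(open("/tmp/manifest_out.json", encoding="utf-8"))`, adjusting the key names if the manifest uses different fields.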
PR Type:
- [x] New Feature
- [ ] Bugfix
- [ ] Documentation
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Additional Information
- Related to #9067
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
@karpnv is this being worked on?
@karpnv I'll provide a review later this week (bandwidth limited)
@titu1994 I covered half of the table and the PR is already huge. Let's review it first. Then I will continue with AggregateTokenizer.
Jenkins
Note: I guess eval_beamsearch_ngram_ctc.py and transducer.py also require changes for Hypothesis, which was recently updated for log probs.
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.