Vladimir Bataev comments


ASR Context Biasing for EncDecHybridRNNTCTCModel (parakeet tdt 0.6b v3)

@sandorkonya With Parakeet-TDT-0.6b-v3 you can use the new phrase boosting by setting:

```shell
python examples/asr/transcribe_speech.py \
    \
    rnnt_decoding.strategy="greedy_batch" \
    rnnt_decoding.greedy.boosting_tree.key_phrases_file=${KEY_WORDS_LIST} \
    rnnt_decoding.greedy.boosting_tree.context_score=1.0 \
    rnnt_decoding.greedy.boosting_tree.depth_scaling=2.0 \
    rnnt_decoding.greedy.boosting_tree_alpha=${BT_ALPHA}
```

See details in https://github.com/NVIDIA-NeMo/NeMo/pull/14277

ASR Context Biasing for EncDecHybridRNNTCTCModel (parakeet tdt 0.6b v3)

@abentabib Yes, it works on both CPUs and GPUs. A pure PyTorch implementation is used when Triton/CUDA is unavailable; on GPU, a more efficient Triton kernel can be used.

ASR Context Biasing for EncDecHybridRNNTCTCModel (parakeet tdt 0.6b v3)

At first glance, everything should work. Several notes:
- `nvidia/parakeet-tdt-0.6b-v3` is a case-sensitive model, so `keywords.txt` should contain words in the desired case (maybe multiple spellings)
- `boosting_tree_alpha = 0.5`...
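
To illustrate the case-sensitivity note, here is a minimal sketch of one way to prepare `keywords.txt` with several casings per phrase. It assumes the key phrases file is plain text with one phrase per line, which this thread does not confirm. The helper names `case_variants` and `write_key_phrases` are hypothetical and not part of NeMo, and which casings are actually worth boosting depends on your data.

```python
from pathlib import Path


def case_variants(phrase: str) -> list[str]:
    """Return distinct casings of a phrase, keeping the original spelling first."""
    candidates = [phrase, phrase.lower(), phrase.upper(), phrase.title()]
    variants: list[str] = []
    for candidate in candidates:
        if candidate not in variants:
            variants.append(candidate)
    return variants


def write_key_phrases(phrases: list[str], path: str) -> list[str]:
    """Write every casing of every phrase, one per line (assumed file format)."""
    lines: list[str] = []
    for phrase in phrases:
        for variant in case_variants(phrase.strip()):
            # Skip empty entries and duplicates produced by overlapping phrases.
            if variant and variant not in lines:
                lines.append(variant)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    # Hypothetical phrases; replace with the terms you want boosted.
    write_key_phrases(["NeMo", "Parakeet"], "keywords.txt")
```

If only one or two spellings can realistically appear in the transcripts, listing just those by hand is probably better than generating every casing.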

Punctuation Marks in Timestamps

@monica-sekoyan can you add some tests, please?