sherpa-onnx
Help for exporting FastConformer NeMo model to onnx for use in sherpa-onnx for streaming inference
Hello!
I was trying to convert NVIDIA's latest FastConformer model, 'STT En FastConformer Hybrid Transducer-CTC Large Streaming Multi', for use in sherpa-onnx.
Model link: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_multi
I used this script available on Hugging Face (https://huggingface.co/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-large/tree/main) to export it:
import nemo.collections.asr as nemo_asr

m = nemo_asr.models.ASRModel.from_pretrained('stt_en_fastconformer_hybrid_large_streaming_multi')
m.export('model.onnx')

with open('tokens.txt', 'w') as f:
    for i, s in enumerate(m.decoder.vocabulary):
        f.write(f"{s} {i}\n")
    f.write(f"<blk> {i+1}\n")
But I get an error:
AttributeError                            Traceback (most recent call last)
/Users/Downloads/Online_ASR_Microphone_Demo_Cache_Aware_Streaming.ipynb Cell 11 line 5
      2 m.export('model.onnx')
      4 with open('tokens.txt', 'w') as f:
----> 5     for i, s in enumerate(m.decoder.vocabulary):
      6         f.write(f"{s} {i}\n")
      7     f.write(f"<blk> {i+1}\n")

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:1688, in Module.__getattr__(self, name)
   1686 if name in modules:
   1687     return modules[name]
-> 1688 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'RNNTDecoder' object has no attribute 'vocabulary'
However, during the export I was able to generate 3 files:
- encoder-model.onnx (456.7 MB)
- decoder_joint-model.onnx (21.3 MB)
- tokens.txt (0 bytes)
Any advice on how I can proceed to export and use this model for streaming inference in streaming_server.py?
Hi,
you can do this. The vocabulary object is present within model.joint, not model.decoder:

with open(onnx_model_path + '/tokens.txt', 'w', encoding='utf-8') as f:
    for i, s in enumerate(model.joint.vocabulary):
        f.write(f"{s} {i}\n")
    f.write(f"<blk> {i+1}\n")
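To make the expected tokens.txt layout concrete, here is a toy, self-contained sketch; the vocabulary list below is made up, and in practice the symbols come from model.joint.vocabulary:

```python
# Toy illustration of the tokens.txt layout sherpa-onnx expects:
# one "<symbol> <id>" pair per line, with the blank token appended last.
# "vocab" is a made-up stand-in for model.joint.vocabulary.
vocab = ["<unk>", "▁hello", "▁world", "ing"]

with open("tokens.txt", "w", encoding="utf-8") as f:
    for i, s in enumerate(vocab):
        f.write(f"{s} {i}\n")
    f.write(f"<blk> {len(vocab)}\n")  # blank id follows the last vocab id
```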
Thanks for the response @sangeet2020.
I was able to export it and got 3 generated files:
- encoder-model.onnx
- decoder_joint-model.onnx
- tokens.txt
There is no separate joiner in the export, and when using this with any online decoding script such as speech-recognition-from-microphone-with-endpoint-detection.py I get this error.

My command: python3 speech-recognition-from-microphone-with-endpoint-detection.py --encoder=encoder-model.onnx --decoder=decoder_joint-model.onnx --tokens=tokens.txt --joiner=decoder_joint-model.onnx (I assume the joiner to be the same as the decoder.)

Error:
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:GetModelType:62 No model_type in the metadata! Please make sure you are using the latest export-onnx.py from icefall to export your transducer models
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:Create:116 Unknown model type in online transducer!
zsh: segmentation fault  python3 speech-recognition-from-microphone-with-endpoint-detection.py
Any idea how to get this to work, or do we need to wait until NeMo streaming models are officially integrated with sherpa-onnx?
Remember that sherpa-onnx has support for EncDecCTCModelBPE models, not EncDecHybridRNNTCTCBPEModel. The model that you are trying, stt_en_fastconformer_hybrid_large_streaming_multi, is an EncDecHybridRNNTCTCBPEModel.
@csukuangfj do you think that, for decoding EncDecHybridRNNTCTCBPEModel models from NeMo, we need a completely new implementation, or can we reuse the implementation for EncDecCTCModelBPE models?
I am working on it. Will finish it this week.
I will add the CTC support first.
Thank you, that's good. I am also working on the implementation of the RNNT decoding, but have been stuck for a while. Some errors I can't get past. Shall I create a PR?
Some errors I can't get past. Shall I create a PR?
Thanks! It would be great if you can share your code.
Thanks for supporting NeMo hybrid model export!
Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.
Some errors I can't get past. Shall I create a PR?
Thanks! It would be great if you can share your code.
Sure. Things are a bit messy right now with the C++ implementation. I will clean it up and create a PR.
To give you an overview, I am trying to replicate the transducer-based implementation under sherpa-onnx/csrc/ for the RNNT decoder in NeMo's EncDecHybridRNNTCTCBPEModel. First I am aiming for offline decoding, and then online streaming decoding. Is this the right direction?
Thank you
Thanks for supporting NeMo hybrid model export!
Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.
Hi @titu1994,
yeah, I kind of saw that in NeMo's source code for model export.
But I do this instead for model export:
import nemo.collections.asr as nemo_asr

model_fname = "stt_en_fastconformer_hybrid_large_streaming_multi_rnnt.nemo"
onnx_model_path = 'onnx_model/' + model_fname.split(".")[0]
!mkdir -p {onnx_model_path}

model = nemo_asr.models.EncDecCTCModelBPE.restore_from('model/' + model_fname)

# Export ONNX model
onnx_enc_model_fname = onnx_model_path + "/" + 'encoder.onnx'
onnx_dec_model_fname = onnx_model_path + "/" + 'decoder.onnx'
onnx_joint_model_fname = onnx_model_path + "/" + 'joint.onnx'

model.encoder.export(onnx_enc_model_fname)
model.decoder.export(onnx_dec_model_fname)
model.joint.export(onnx_joint_model_fname)
Shouldn't this work as well, or no?
Thank you
Shouldn't this work as well, or no?
Yes, this approach also works perfectly.
To give you an overview, I am trying to replicate the transducer-based implementation under sherpa-onnx/csrc/ for the RNNT decoder in NeMo's EncDecHybridRNNTCTCBPEModel. First I am aiming for offline decoding, and then online streaming decoding. Is this the right direction?
Yes, that looks good to me. Looking forward to your contribution!
Note that the decoder model from NeMo is stateful, not stateless.
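The stateful/stateless distinction can be sketched with toy numpy stand-ins (everything below is made up for illustration; the shapes and update rules are not the real NeMo or icefall code):

```python
import numpy as np

DIM = 4  # toy hidden size, purely illustrative

def stateful_step(token_emb, state):
    # NeMo-style prediction network: an LSTM, so (h, c) must be threaded
    # through every call -- and carried across ONNX inference calls too.
    h, c = state
    h_new = np.tanh(token_emb + h)  # stand-in for the real LSTM update
    return h_new, (h_new, c)

def stateless_step(last_tokens, embedding):
    # icefall-style stateless decoder: the output depends only on a fixed
    # window of previous tokens, so there is no state to carry.
    return embedding[last_tokens].mean(axis=0)

emb = np.eye(DIM)
out_s, state = stateful_step(emb[1], (np.zeros(DIM), np.zeros(DIM)))
out_sl = stateless_step([1, 2], emb)
```

The practical consequence for ONNX inference is that a stateful decoder needs extra state inputs/outputs on the exported graph, while a stateless one does not.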
Please use the export script from https://github.com/k2-fsa/sherpa-onnx/pull/844
You can refer to the following script https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-transducer.py for how decoding is done for streaming transducer from NeMo.
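For orientation before reading that script, the overall shape of greedy transducer search can be sketched with toy numpy stand-ins (everything here is made up for illustration: the real encoder/decoder/joiner are onnxruntime sessions with state threading, and full greedy RNNT search may emit more than one symbol per frame, whereas this sketch caps it at one):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, BLANK = 6, 4, 0  # toy sizes; real models are far larger

W_join = rng.normal(size=(2 * DIM, VOCAB))  # stand-in joiner weights

def decoder_step(token):
    # Placeholder for the (stateful) NeMo decoder; real inference would
    # run an onnxruntime session and thread LSTM states through.
    emb = np.zeros(DIM)
    if token is not None:
        emb[token % DIM] = 1.0
    return emb

def joiner(enc_frame, dec_out):
    # Placeholder for the joiner: combines encoder and decoder outputs
    # into per-token logits.
    return np.concatenate([enc_frame, dec_out]) @ W_join

def greedy_search(encoder_out, blank_id=BLANK):
    hyp, dec_out = [], decoder_step(None)
    for frame in encoder_out:                  # one step per encoder frame
        token = int(np.argmax(joiner(frame, dec_out)))
        if token != blank_id:                  # emit at most one symbol/frame
            hyp.append(token)
            dec_out = decoder_step(token)      # advance decoder on non-blank
    return hyp

hyp = greedy_search(rng.normal(size=(10, DIM)))  # 10 fake encoder frames
```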
By the way, to export a non-streaming model, I think you need to remove "cache_support": True
from
https://github.com/k2-fsa/sherpa-onnx/blob/68b25abf2712a4e6b5fa5f846fd9c23f72f5f860/scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-transducer.py#L88
Please ask at any time if you have any issues.
Thanks for supporting NeMo hybrid model export!
Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.
We have exported the decoder and joiner separately in https://github.com/k2-fsa/sherpa-onnx/pull/844
By the way, is there any plan to use a stateless decoder model in NeMo? @titu1994