
Help for exporting FastConformer NeMo model to onnx for use in sherpa-onnx for streaming inference

tempops opened this issue 10 months ago · 1 comment

Hello!

I was trying to convert NVIDIA's latest FastConformer model, 'STT En FastConformer Hybrid Transducer-CTC Large Streaming Multi', for use in sherpa-onnx.

Model link: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_multi

I used this script, available on Hugging Face (https://huggingface.co/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-large/tree/main), to export it:

```python
import nemo.collections.asr as nemo_asr

m = nemo_asr.models.ASRModel.from_pretrained('stt_en_fastconformer_hybrid_large_streaming_multi')
m.export('model.onnx')

with open('tokens.txt', 'w') as f:
    for i, s in enumerate(m.decoder.vocabulary):
        f.write(f"{s} {i}\n")
    f.write(f"<blk> {i+1}\n")
```

But I get an error:


```
AttributeError                            Traceback (most recent call last)
/Users/Downloads/Online_ASR_Microphone_Demo_Cache_Aware_Streaming.ipynb Cell 11 line 5
      2 m.export('model.onnx')
      4 with open('tokens.txt', 'w') as f:
----> 5     for i, s in enumerate(m.decoder.vocabulary):
      6         f.write(f"{s} {i}\n")
      7     f.write(f"<blk> {i+1}\n")

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:1688, in Module.__getattr__(self, name)
   1686 if name in modules:
   1687     return modules[name]
-> 1688 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'RNNTDecoder' object has no attribute 'vocabulary'
```

However, during export I was able to generate 3 files:

  1. encoder-model.onnx (456.7 MB)
  2. decoder_joint-model.onnx (21.3 MB)
  3. tokens.txt (0 bytes)

Any advice on how I can proceed to export and use this model for streaming inference in streaming_server.py?

tempops avatar Apr 18 '24 18:04 tempops

Hi, you can do this. The vocabulary attribute is present within model.joint, not model.decoder:

with open(onnx_model_path + '/tokens.txt', 'w', encoding='utf-8') as f:
    for i, s in enumerate(model.joint.vocabulary):
        f.write(f"{s} {i}\n")
    f.write(f"<blk> {i+1}\n")
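For reference, the resulting tokens.txt is a plain-text symbol table: one subword per line followed by its integer id, with `<blk>` appended as the final id. A minimal sketch with a toy vocabulary (the real list comes from `model.joint.vocabulary`):

```python
# Toy vocabulary standing in for model.joint.vocabulary (illustrative;
# the real model has on the order of a thousand BPE subwords).
vocabulary = ["▁the", "▁a", "ing", "s"]

with open("tokens.txt", "w", encoding="utf-8") as f:
    for i, s in enumerate(vocabulary):
        f.write(f"{s} {i}\n")
    # The blank token takes the next id after the vocabulary.
    f.write(f"<blk> {i + 1}\n")
```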

sangeet2020 avatar Apr 26 '24 09:04 sangeet2020

Thanks for the response @sangeet2020

I was able to export it and got 3 generated files:

  1. encoder-model.onnx
  2. decoder_joint-model.onnx
  3. tokens.txt

There is no separate joiner in the export, and when using it with any online decoding script like speech-recognition-from-microphone-with-endpoint-detection.py I get this error:

My command (I assumed the joiner to be the same file as the decoder):

```
python3 speech-recognition-from-microphone-with-endpoint-detection.py \
  --encoder=encoder-model.onnx \
  --decoder=decoder_joint-model.onnx \
  --joiner=decoder_joint-model.onnx \
  --tokens=tokens.txt
```

Error:

```
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:GetModelType:62 No model_type in the metadata!
Please make sure you are using the latest export-onnx.py from icefall to export your transducer models
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:Create:116 Unknown model type in online transducer!
zsh: segmentation fault  python3 speech-recognition-from-microphone-with-endpoint-detection.py
```

Any idea how to get it to work, or do we need to wait until official support for NeMo streaming models is added to sherpa-onnx?

tempops avatar Apr 30 '24 18:04 tempops

Remember that sherpa-onnx has support for EncDecCTCModelBPE models, not EncDecHybridRNNTCTCBPEModel. The model you are trying, stt_en_fastconformer_hybrid_large_streaming_multi, is an EncDecHybridRNNTCTCBPEModel.

sangeet2020 avatar May 06 '24 14:05 sangeet2020

@csukuangfj do you think that, for decoding EncDecHybridRNNTCTCBPEModel models from NeMo, we need a completely new implementation? Or can we reuse the implementation for EncDecCTCModelBPE models?

sangeet2020 avatar May 06 '24 14:05 sangeet2020

I am working on it. I will finish it this week.

I will add the CTC support first.

csukuangfj avatar May 07 '24 11:05 csukuangfj

Thank you, that's good. I am also working on the implementation of the RNNT decoding, but have been stuck for a while on some errors I can't get past. Shall I create a PR?

sangeet2020 avatar May 07 '24 11:05 sangeet2020

Some errors I can't get past. Shall I create a PR?

Thanks! It would be great if you could share your code.

csukuangfj avatar May 07 '24 11:05 csukuangfj

Thanks for supporting NeMo hybrid model export!

Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.

titu1994 avatar May 08 '24 04:05 titu1994

Some errors I can't get past. Shall I create a PR?

Thanks! It would be great if you could share your code.

Sure. Things are a bit messy right now with the C++ implementation; I will clean it up and create a PR. To give you an overview, I am trying to replicate the transducer-based implementation under sherpa-onnx/csrc/ for the RNNT decoder in NeMo's EncDecHybridRNNTCTCBPEModel. First I am aiming for offline decoding, and then online streaming decoding. Is this the right direction?
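For reference, the transducer greedy search being replicated can be sketched roughly like this (a toy version simplified to at most one emission per frame; all functions and dimensions are illustrative stand-ins, not sherpa-onnx's actual API):

```python
import numpy as np

np.random.seed(0)
V, D = 5, 8                        # toy vocab size and hidden dim; blank id = 0
enc_out = np.random.randn(3, D)    # stand-in for 3 encoder output frames

def decoder(token):
    # Stand-in for the prediction network: deterministic per-token vector.
    return np.random.RandomState(token).randn(D)

def joiner(enc, dec):
    # Stand-in for the joint network; the real one is a learned projection.
    return (enc + dec)[:V]

hyp = [0]                          # start from blank
for enc in enc_out:
    logits = joiner(enc, decoder(hyp[-1]))
    tok = int(np.argmax(logits))
    if tok != 0:                   # emit only non-blank tokens
        hyp.append(tok)

print("decoded token ids:", hyp[1:])
```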

thank you

sangeet2020 avatar May 08 '24 08:05 sangeet2020

Thanks for supporting NeMo hybrid model export!

Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.

Hi @titu1994 ,

Yeah, I kind of saw that in NeMo's source code for model export.

But I do this instead for model export (a notebook cell, hence the `!mkdir`):

```python
import nemo.collections.asr as nemo_asr

model_fname = "stt_en_fastconformer_hybrid_large_streaming_multi_rnnt.nemo"
onnx_model_path = 'onnx_model/' + model_fname.split(".")[0]

!mkdir -p {onnx_model_path}

model = nemo_asr.models.EncDecCTCModelBPE.restore_from('model/' + model_fname)

# Export ONNX models for the encoder, decoder, and joint separately
onnx_enc_model_fname = onnx_model_path + "/" + 'encoder.onnx'
onnx_dec_model_fname = onnx_model_path + "/" + 'decoder.onnx'
onnx_joint_model_fname = onnx_model_path + "/" + 'joint.onnx'

model.encoder.export(onnx_enc_model_fname)
model.decoder.export(onnx_dec_model_fname)
model.joint.export(onnx_joint_model_fname)
```

Shouldn't this work as well, or not?

Thank You

sangeet2020 avatar May 08 '24 08:05 sangeet2020

Shouldn't this work as well, or not?

Yes, this approach also works perfectly.

csukuangfj avatar May 08 '24 11:05 csukuangfj

To give you an overview, I am trying to replicate the transducer-based implementation under sherpa-onnx/csrc/ for the RNNT decoder in NeMo's EncDecHybridRNNTCTCBPEModel. First I am aiming for offline decoding, and then online streaming decoding. Is this the right direction?

Yes, that looks good to me. Looking forward to your contribution!

Note that the decoder model from NeMo is stateful, not stateless.
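To unpack "stateful vs. stateless": NeMo's RNNT prediction network is an LSTM whose hidden and cell tensors must be carried between calls, whereas an icefall-style stateless decoder conditions only on the last few emitted token ids, so the "state" is just those ids. A toy numpy sketch of the stateless idea (all names and dimensions are illustrative):

```python
import numpy as np

np.random.seed(0)
context_size = 2            # icefall's stateless decoders typically use 2
vocab_size, dim = 10, 4
embedding = np.random.randn(vocab_size, dim)

def stateless_decoder(last_token_ids):
    # Output depends only on the last `context_size` token ids;
    # there are no hidden/cell tensors to carry across calls.
    assert len(last_token_ids) == context_size
    return embedding[last_token_ids].reshape(-1)

out = stateless_decoder([3, 7])
```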

Please use the export script from https://github.com/k2-fsa/sherpa-onnx/pull/844

You can refer to the following script https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-transducer.py for how decoding is done for streaming transducer from NeMo.


By the way, to export a non-streaming model, I think you need to remove "cache_support": True from https://github.com/k2-fsa/sherpa-onnx/blob/68b25abf2712a4e6b5fa5f846fd9c23f72f5f860/scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-transducer.py#L88
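Concretely, the streaming export in that script enables cache support before calling export; a hedged sketch of the two variants (assuming NeMo's `set_export_config` API; check the linked script for the exact call):

```python
# Streaming (cache-aware) export, as in export-onnx-transducer.py:
model.set_export_config({"cache_support": "True"})
model.encoder.export("encoder.onnx")

# Non-streaming export: leave cache_support unset (assumed default).
```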


Please ask at any time if you have any issues.

csukuangfj avatar May 08 '24 11:05 csukuangfj

Thanks for supporting NeMo hybrid model export!

Side note: we fuse the decoder and joint during RNNT export, so that might complicate usage.

We have exported the decoder and joiner separately in https://github.com/k2-fsa/sherpa-onnx/pull/844

By the way, is there any plan to use a stateless decoder model in NeMo? @titu1994

csukuangfj avatar May 08 '24 11:05 csukuangfj