Export conformer_ctc3 streaming model to jit trace
Hi @csukuangfj, Could you please let us know any update on this? Thanks
Sorry, conformer_ctc3 does not support streaming recognition. Please switch to the streaming zipformer if possible.
https://github.com/k2-fsa/icefall/pull/941 is for zipformer + ctc
Thanks @csukuangfj. We are using the zipformer_ctc implementation from https://github.com/k2-fsa/icefall/pull/941. We hope this is the streaming variant and that it supports streaming. Please confirm. Thank you.
@uni-manjunath-ke
#941 combines zipformer with CTC; however, it is not streaming.
Please use pruned_transducer_stateless7_streaming, which is a streaming version. If you want to use CTC, please combine pruned_transducer_stateless7_streaming with #941.
Thanks @csukuangfj ..
How do I do that? Should we understand the difference between pruned_transducer_stateless7 and zipformer_ctc, and then adapt it to pruned_transducer_stateless7_streaming? Is this correct? Please suggest. Thanks. Tagging @pavankumar-ds
pruned_transducer_stateless7 and zipformer_ctc share the same zipformer.py
Please first have a look at pruned_transducer_stateless7_streaming. After you read the code, I believe you will know it.
Hi @csukuangfj and @desh2608, we have gone through the code, and we feel it might take us some time to understand it thoroughly and implement it ourselves. Meanwhile, we just wanted to check whether you have any plan to implement this streaming version of zipformer_ctc. Considering this as a request, is it possible for you to implement it? Thanks.
Sorry, I don't have the bandwidth for this at the moment. As Fangjun mentioned, it should be relatively straightforward to create such a recipe based on pruned_transducer_stateless7_streaming and zipformer_ctc. You can basically just copy over the zipformer_ctc files. Then change the following:
- Replace zipformer.py with the one from pruned_transducer_stateless7_streaming. This is the streaming variant of Zipformer.
- In train.py, add the "chunk"-related arguments (see here). Also search for "chunk" in that file and add all of those pieces to zipformer_ctc's train.py.
- Similarly, look for "chunk" in decode.py of pruned_transducer_stateless7_streaming, and add those in decode.py of zipformer_ctc.
I think these are all the changes needed.
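To make the "chunk" arguments in step 2 more concrete, here is a rough sketch of what the added argparse flags look like. The flag names and defaults below follow my reading of pruned_transducer_stateless7_streaming at the time of writing; treat them as placeholders and verify them against your icefall checkout.

```python
import argparse


def add_streaming_arguments(parser: argparse.ArgumentParser) -> None:
    """Add chunk-related flags for streaming training.

    NOTE: flag names/defaults are assumptions based on
    pruned_transducer_stateless7_streaming; check your checkout.
    """
    parser.add_argument(
        "--short-chunk-size",
        type=int,
        default=50,
        help="Chunk length (in frames) used for dynamic-chunk training.",
    )
    parser.add_argument(
        "--num-left-chunks",
        type=int,
        default=4,
        help="Number of left chunks the attention can see; "
        "-1 means unlimited left context.",
    )
    parser.add_argument(
        "--decode-chunk-len",
        type=int,
        default=32,
        help="Chunk length (in frames) used at decoding time.",
    )


parser = argparse.ArgumentParser()
add_streaming_arguments(parser)
args = parser.parse_args(["--short-chunk-size", "32"])
print(args.short_chunk_size, args.num_left_chunks, args.decode_chunk_len)
```

The same flags then need to be threaded through to the model constructor in train.py, and the decode-time one read in decode.py.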
Sure, thanks a lot for the detailed steps. We will work on it and update you.
Hi @desh2608 & @csukuangfj, thanks for your suggestions. We were able to make the suggested modifications and train a zipformer CTC streaming model. Could you please let us know whether we can push this zipformer_ctc_streaming recipe to the repository. Thanks
Is there any recipe in sherpa triton for zipformer ctc streaming ?
Hi @csukuangfj and @desh2608, we tried to export zipformer_ctc_streaming to jit format, but we are getting the errors below. We also tried to export it to ONNX format, but the changes we made to export_onnx.py also gave errors. Could you please advise. Thanks.
build/sherpa/./bin/sherpa-online --nn-model=/mnt/efs/manju/if/icefall/egs/librispeech/ASR/zipformer_ctc_streaming/exp/cpu_jit.pt --tokens=/mnt/efs/manju/if/icefall/egs/librispeech/ASR/data/in_en/lang_bpe_500/./tokens.txt --use-gpu=true --decoding-method=greedy_search /mnt/efs/manju/if/tools/16pcm_re_test__q_vNeC-nX4X0LWuORXDmp_l_0001.wav
[I] /mnt/efs/manju/if/tools/sherpa/sherpa/csrc/parse-options.cc:495:int sherpa::ParseOptions::Read(int, const char* const*) 2023-05-11 08:17:26.563 (echoes the command line above)
[I] /mnt/efs/manju/if/tools/sherpa/sherpa/cpp_api/bin/online-recognizer.cc:145:int32_t main(int32_t, char**) 2023-05-11 08:17:26.567 decoding method: greedy_search
Aborted (core dumped)
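As an aside on the `--decoding-method=greedy_search` flag used above: for a CTC head, greedy search is just frame-wise argmax followed by collapsing repeats and dropping blanks. sherpa's actual implementation is in C++ and is not inspected here; this is only a minimal pure-Python sketch of the idea, with made-up toy numbers, useful for sanity-checking expected outputs.

```python
from typing import List


def ctc_greedy_decode(probs: List[List[float]], blank: int = 0) -> List[int]:
    """Collapse a frame-by-frame posterior matrix into a token sequence.

    probs: one row per frame, one column per token.
    blank: the CTC blank id (0 in icefall's BPE lang dirs).
    """
    # Frame-wise argmax.
    best = [row.index(max(row)) for row in probs]
    out: List[int] = []
    prev = blank
    for tok in best:
        # Keep a token only if it differs from the previous frame's
        # token (collapse repeats) and is not the blank.
        if tok != prev and tok != blank:
            out.append(tok)
        prev = tok
    return out


# Toy example: 6 frames, 4 tokens (0 = blank).
probs = [
    [0.1, 0.7, 0.1, 0.1],    # -> token 1
    [0.1, 0.7, 0.1, 0.1],    # -> token 1 (repeat, collapsed)
    [0.8, 0.1, 0.05, 0.05],  # -> blank
    [0.1, 0.1, 0.1, 0.7],    # -> token 3
    [0.8, 0.1, 0.05, 0.05],  # -> blank
    [0.1, 0.1, 0.7, 0.1],    # -> token 2
]
print(ctc_greedy_decode(probs))  # [1, 3, 2]
```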
k2-fsa/sherpa supports only streaming transducers. If you could contribute a streaming CTC model, we can add that to k2-fsa/sherpa.
Yes, I think onnx recipe for sherpa with triton for ctc zipformer would be nice to have.
We currently don't have such a recipe in icefall. Would you mind contributing one to icefall and making the model public, so that we can use it for testing when adding it to sherpa?
If I share a zipformer ctc streaming model, will that be fine? Thanks
We want more people to benefit from the code. If we only have a pre-trained model, then users other than you won't have code to train their own models, and the code will mostly be usable only by you.
Adding to @uni-manjunath-ke's points: yes, we'd also like to add the recipe to icefall. We'll take a few days to run it on standard LibriSpeech and include the WER.
Sure. Could you please guide us on how to push our zipformer-ctc-streaming code? Thanks
Could you follow https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request to make a pull-request to icefall?
Hi @csukuangfj , I have created a fork and uploaded zipformer_ctc_streaming at https://github.com/uni-manjunath-ke/icefall/tree/zipformer_ctc_streaming/egs/librispeech/ASR/zipformer_ctc_streaming
We have also run this code on LibriSpeech. These are our results:
- avg 15: WER 10.51% on test-other, 4.07% on test-clean
- avg 9: WER 10.30% on test-other, 4.0% on test-clean
Please let us know further steps. Thanks
- How much data have you used to train the model? train-clean-100 or the full librispeech (960 hours)?
- How many epochs have you run? Are the posted numbers the best after searching different combinations of --epoch and --avg?
- Which decoding method are you using?
- Could you make a pull request first?
- Could you update RESULTS.md to include your results? You can find the information you need to fill in by following the other folders in RESULTS.md.
Thanks!
Thanks. Updated RESULTS.md and created a pull request at https://github.com/k2-fsa/icefall/pull/1106