Export conformer_ctc3 streaming model to jit trace
Hi @csukuangfj, Could you please let us know any update on this? Thanks
Sorry, conformer_ctc3 does not support streaming recognition. Please switch to the streaming zipformer if possible.
https://github.com/k2-fsa/icefall/pull/941 is for zipformer + ctc
Thanks @csukuangfj. We are using the zipformer_ctc implementation from https://github.com/k2-fsa/icefall/pull/941. We hope this is the streaming variant and that it supports streaming. Please confirm. Thank you.
@uni-manjunath-ke
#941 combines zipformer with CTC; however, it is not streaming.
Please use pruned_transducer_stateless7_streaming, which is a streaming version. If you want to use CTC, please combine pruned_transducer_stateless7_streaming with #941.
Thanks @csukuangfj ..
How do I do that? Should we understand the difference between pruned_transducer_stateless7 and zipformer_ctc, and then adapt it to pruned_transducer_stateless7_streaming? Is this correct? Please suggest. Thanks. Tagging @pavankumar-ds
pruned_transducer_stateless7 and zipformer_ctc share the same zipformer.py
Please first have a look at pruned_transducer_stateless7_streaming. After you read the code, I believe you will know it.
Hi @csukuangfj and @desh2608, we have gone through the code, and we feel it might take us some time to understand it thoroughly and implement it ourselves. Meanwhile, we just wanted to check whether you have any plan to implement this streaming version of zipformer_ctc. Considering this as a request, is it possible for you to implement it? Thanks.
Sorry, I don't have the bandwidth for this at the moment. As Fangjun mentioned, it should be relatively straightforward to create such a recipe based on pruned_transducer_stateless7_streaming and zipformer_ctc. You can basically just copy over the zipformer_ctc files. Then change the following:
- Replace zipformer.py with the one from pruned_transducer_stateless7_streaming. This is the streaming variant of Zipformer.
- In train.py, add the "chunk"-related arguments (see here). Also search for "chunk" in that file and add all of those pieces to zipformer_ctc's train.py.
- Similarly, look for "chunk" in decode.py of pruned_transducer_stateless7_streaming, and add those in decode.py of zipformer_ctc.
I think these are all the changes needed.
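To make the "chunk" arguments in step 2 more concrete, here is a rough sketch of what the added argparse flags look like. The flag names and defaults below follow my reading of pruned_transducer_stateless7_streaming at the time of writing; treat them as placeholders and verify them against your icefall checkout.

```python
import argparse


def add_streaming_arguments(parser: argparse.ArgumentParser) -> None:
    """Add chunk-related flags for streaming training.

    NOTE: flag names/defaults are assumptions based on
    pruned_transducer_stateless7_streaming; check your checkout.
    """
    parser.add_argument(
        "--short-chunk-size",
        type=int,
        default=50,
        help="Chunk length (in frames) used for dynamic-chunk training.",
    )
    parser.add_argument(
        "--num-left-chunks",
        type=int,
        default=4,
        help="Number of left chunks the attention can see; "
        "-1 means unlimited left context.",
    )
    parser.add_argument(
        "--decode-chunk-len",
        type=int,
        default=32,
        help="Chunk length (in frames) used at decoding time.",
    )


parser = argparse.ArgumentParser()
add_streaming_arguments(parser)
args = parser.parse_args(["--short-chunk-size", "32"])
print(args.short_chunk_size, args.num_left_chunks, args.decode_chunk_len)
```

The same flags then need to be threaded through to the model constructor in train.py, and the decode-time one read in decode.py.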
Sure, thanks a lot for the detailed steps. We will work on it and update you.
Hi @desh2608 & @csukuangfj, thanks for your suggestions. We were able to make the suggested modifications and train a zipformer CTC streaming model. Could you please let us know whether we can push this zipformer_ctc_streaming recipe to the repository. Thanks
Is there any recipe in sherpa triton for zipformer ctc streaming ?
Hi @csukuangfj and @desh2608, we tried to export zipformer_ctc_streaming to jit format, but we are getting the errors below. We also tried to export it to ONNX format, but the changes we made to export_onnx.py also gave errors. Could you please advise. Thanks.
build/sherpa/./bin/sherpa-online --nn-model=/mnt/efs/manju/if/icefall/egs/librispeech/ASR/zipformer_ctc_streaming/exp/cpu_jit.pt --tokens=/mnt/efs/manju/if/icefall/egs/librispeech/ASR/data/in_en/lang_bpe_500/./tokens.txt --use-gpu=true --decoding-method=greedy_search /mnt/efs/manju/if/tools/16pcm_re_test__q_vNeC-nX4X0LWuORXDmp_l_0001.wav
[I] /mnt/efs/manju/if/tools/sherpa/sherpa/csrc/parse-options.cc:495:int sherpa::ParseOptions::Read(int, const char* const*) 2023-05-11 08:17:26.563 (echoes the command line above)
[I] /mnt/efs/manju/if/tools/sherpa/sherpa/cpp_api/bin/online-recognizer.cc:145:int32_t main(int32_t, char**) 2023-05-11 08:17:26.567 decoding method: greedy_search
Aborted (core dumped)
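As an aside on the `--decoding-method=greedy_search` flag used above: for a CTC head, greedy search is just frame-wise argmax followed by collapsing repeats and dropping blanks. sherpa's actual implementation is in C++ and is not inspected here; this is only a minimal pure-Python sketch of the idea, with made-up toy numbers, useful for sanity-checking expected outputs.

```python
from typing import List


def ctc_greedy_decode(probs: List[List[float]], blank: int = 0) -> List[int]:
    """Collapse a frame-by-frame posterior matrix into a token sequence.

    probs: one row per frame, one column per token.
    blank: the CTC blank id (0 in icefall's BPE lang dirs).
    """
    # Frame-wise argmax.
    best = [row.index(max(row)) for row in probs]
    out: List[int] = []
    prev = blank
    for tok in best:
        # Keep a token only if it differs from the previous frame's
        # token (collapse repeats) and is not the blank.
        if tok != prev and tok != blank:
            out.append(tok)
        prev = tok
    return out


# Toy example: 6 frames, 4 tokens (0 = blank).
probs = [
    [0.1, 0.7, 0.1, 0.1],    # -> token 1
    [0.1, 0.7, 0.1, 0.1],    # -> token 1 (repeat, collapsed)
    [0.8, 0.1, 0.05, 0.05],  # -> blank
    [0.1, 0.1, 0.1, 0.7],    # -> token 3
    [0.8, 0.1, 0.05, 0.05],  # -> blank
    [0.1, 0.1, 0.7, 0.1],    # -> token 2
]
print(ctc_greedy_decode(probs))  # [1, 3, 2]
```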
k2-fsa/sherpa supports only streaming transducers. If you could contribute a streaming CTC model, we can add that to k2-fsa/sherpa.
Yes, I think onnx recipe for sherpa with triton for ctc zipformer would be nice to have.
We currently don't have such a recipe in icefall. Would you mind contributing one to icefall and making the model public, so that we can use it for testing when adding it to sherpa?
If I share a zipformer ctc streaming model, will that be fine? Thanks
We want more people to benefit from the code. If we only have a pre-trained model, then users other than you won't have code to train their own models, and the code will mostly be usable only by you.
Adding to @uni-manjunath-ke's points: yes, we'd also like to add the recipe to icefall. We'll take a few days to run it on standard LibriSpeech and include the WER.
Sure. Could you please guide us on how to push our zipformer-ctc-streaming code? Thanks
Could you follow https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request to make a pull-request to icefall?
Hi @csukuangfj , I have created a fork and uploaded zipformer_ctc_streaming at https://github.com/uni-manjunath-ke/icefall/tree/zipformer_ctc_streaming/egs/librispeech/ASR/zipformer_ctc_streaming
We have also run this code on LibriSpeech. These are our results:
- avg 15: WER 10.51% on test-other, 4.07% on test-clean
- avg 9: WER 10.30% on test-other, 4.0% on test-clean
Please let us know further steps. Thanks
- How much data have you used to train the model? train-clean-100 or the full librispeech (960 hours)?
- How many epochs have you run? Are the posted numbers the best after searching different combinations of --epoch and --avg?
- Which decoding method are you using?
- Could you make a pull request first?
- Could you update RESULTS.md to include your results? You can find the information you need to fill in by following the other folders in RESULTS.md.
Thanks!
Thanks. Updated RESULTS.md and created a pull request at https://github.com/k2-fsa/icefall/pull/1106