sherpa-onnx
Add C++ runtime for *streaming* faster conformer transducer from NeMo.
This PR integrates NeMo's faster conformer transducer into sherpa-decoder. More commits will be added.
@csukuangfj would we need StackStates and UnStackStates methods for this?
Yes, please refer to https://github.com/k2-fsa/sherpa-onnx/blob/8af2af84664d3285ba452bf453bb928a3eb6e978/sherpa-onnx/csrc/online-nemo-ctc-model.cc#L121-L122
and
https://github.com/k2-fsa/sherpa-onnx/blob/8af2af84664d3285ba452bf453bb928a3eb6e978/sherpa-onnx/csrc/online-nemo-ctc-model.cc#L156-L157
Note that for decoding, you can support only batch_size == 1.
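For readers new to the idea, stacking and unstacking per-stream states can be sketched as follows. This is a minimal illustration only: it uses a flattened std::vector<float> in place of Ort::Value, and State, StackStates and UnStackStates here are simplified stand-ins rather than the signatures used in the linked sherpa-onnx code. It also assumes the batch axis comes first, which may not hold for real state tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one state tensor of one stream (flattened).
using State = std::vector<float>;

// Concatenate per-stream states along a leading batch axis.
// Assumes every stream's state has the same number of elements.
std::vector<float> StackStates(const std::vector<State> &per_stream) {
  std::vector<float> batched;
  for (const auto &s : per_stream) {
    assert(s.size() == per_stream.front().size());
    batched.insert(batched.end(), s.begin(), s.end());
  }
  return batched;
}

// Split a batched state back into one state per stream.
std::vector<State> UnStackStates(const std::vector<float> &batched,
                                 std::size_t batch_size) {
  assert(batch_size > 0 && batched.size() % batch_size == 0);
  const std::size_t n = batched.size() / batch_size;
  std::vector<State> per_stream;
  for (std::size_t b = 0; b != batch_size; ++b) {
    per_stream.emplace_back(batched.begin() + b * n,
                            batched.begin() + (b + 1) * n);
  }
  return per_stream;
}
```

With batch_size == 1, both functions reduce to a copy, which is why supporting only a single stream keeps the decoding code simple.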
Hi @csukuangfj ,
Could you please help me with online-transducer-greedy-search-nemo-decoder.cc? A basic outline would be a good start.
Thank you
- Please refer to our Python example for online NeMo transducer greedy search decoding: https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-transducer.py
- For simplicity, please support only batch size == 1 for greedy search.
- Please refer to the offline NeMo transducer greedy search decoding in C++ at https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.h
  All you need is to change the offline C++ version to an online version.
- NeMo transducer is stateful, so you need to follow https://github.com/k2-fsa/sherpa-onnx/blob/8af2af84664d3285ba452bf453bb928a3eb6e978/sherpa-onnx/csrc/online-stream.h#L91-L92
  to add two methods, e.g.,
  void SetNeMoDecoderStates(std::vector<Ort::Value> states);
  std::vector<Ort::Value> &GetNeMoDecoderStates();
- You need to follow https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/offline-recognizer-transducer-nemo-impl.h to add
  online-recognizer-transducer-nemo-impl.h
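As a rough illustration of the two stream methods suggested above, here is a minimal sketch. FakeValue is a hypothetical move-only stand-in for Ort::Value (which is itself move-only), and StreamSketch is a simplified stand-in for OnlineStream, so the snippet compiles without onnxruntime. Only the two method names come from the suggestion; everything else is an assumption.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical move-only stand-in for Ort::Value.
struct FakeValue {
  explicit FakeValue(int v) : id(v) {}
  FakeValue(FakeValue &&) = default;
  FakeValue &operator=(FakeValue &&) = default;
  FakeValue(const FakeValue &) = delete;
  FakeValue &operator=(const FakeValue &) = delete;
  int id;
};

// Simplified stand-in for OnlineStream: it only holds decoder states.
class StreamSketch {
 public:
  // Takes ownership of the states produced while decoding the latest chunk.
  void SetNeMoDecoderStates(std::vector<FakeValue> states) {
    decoder_states_ = std::move(states);
  }

  // Returns a reference so the states can be fed to the next chunk
  // without copying (the values are move-only).
  std::vector<FakeValue> &GetNeMoDecoderStates() { return decoder_states_; }

 private:
  std::vector<FakeValue> decoder_states_;
};
```

Passing the vector by value and moving it into the member lets the caller hand over ownership with std::move, which matters because the stored values cannot be copied.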
@csukuangfj could you please review these changes? I am waiting for your feedback.
Also, could you assist me with online-transducer-greedy-search-nemo-decoder.cc? Following offline-transducer-greedy-search-nemo-decoder.cc is not so helpful in this case, as it's a streaming mode.
Thank You
By the way, you need to change https://github.com/k2-fsa/sherpa-onnx/blob/81346d11728e675ddea2645738b394a8b82078d3/sherpa-onnx/csrc/online-recognizer-impl.cc#L15-L17
and
https://github.com/k2-fsa/sherpa-onnx/blob/81346d11728e675ddea2645738b394a8b82078d3/sherpa-onnx/csrc/online-recognizer-impl.cc#L36-L38
You can use the number of outputs from the decoder model to decide whether to create a normal OnlineRecognizerTransducerImpl or OnlineRecognizerTransducerNeMoImpl.
You can refer to
https://github.com/k2-fsa/sherpa-onnx/blob/81346d11728e675ddea2645738b394a8b82078d3/sherpa-onnx/csrc/online-transducer-model.cc#L45
to create a session for the decoder model
and refer to the following code to get the number of outputs for the decoder model
https://github.com/k2-fsa/sherpa-onnx/blob/81346d11728e675ddea2645738b394a8b82078d3/sherpa-onnx/csrc/onnx-utils.cc#L38
You only need to support two kinds of transducer models in sherpa-onnx: one for stateless transducer, and one for NeMo stateful transducer.
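The selection logic described above could look roughly like the sketch below. All type names are hypothetical stand-ins, and the threshold is an assumption: the sketch treats a decoder with exactly one output as stateless and anything with more outputs as a stateful NeMo decoder. In real code the count would be read from the decoder's ONNX session (for example via Ort::Session::GetOutputCount()).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins for the recognizer implementations.
struct RecognizerImplSketch {
  virtual ~RecognizerImplSketch() = default;
  virtual std::string Name() const = 0;
};

struct StatelessTransducerImplSketch : RecognizerImplSketch {
  std::string Name() const override { return "stateless transducer"; }
};

struct NeMoTransducerImplSketch : RecognizerImplSketch {
  std::string Name() const override { return "NeMo stateful transducer"; }
};

// Assumption: a stateless decoder exposes exactly one output, while a
// stateful NeMo decoder exposes extra outputs for its states.
std::unique_ptr<RecognizerImplSketch> CreateImpl(
    std::size_t num_decoder_outputs) {
  assert(num_decoder_outputs >= 1);
  if (num_decoder_outputs == 1) {
    return std::make_unique<StatelessTransducerImplSketch>();
  }
  return std::make_unique<NeMoTransducerImplSketch>();
}
```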
Following offline-transducer-greedy-search-nemo-decoder.cc is not so helpful in this case, as it's a streaming mode
We have both a C++ and a Python version for the non-streaming NeMo transducer greedy search and a Python version for streaming NeMo transducer greedy search.
Please read them carefully. The only differences from the non-streaming one:
- You need to process the input chunk by chunk; there are already code examples for the stateless streaming transducer and for the stateful NeMo CTC model
- You need to save the decoder states across chunks
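The two differences listed above can be sketched with a toy model. Nothing below is sherpa-onnx code: DecoderStateSketch, JoinFn, ToyJoiner and DecodeChunk are hypothetical names, the "joiner" is a trivial function instead of ONNX sessions, and max_symbols_per_frame is an assumed safety limit. The point is only that the state object outlives each call, so the next chunk continues from where the previous one stopped.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical per-stream state that must survive between chunks.
struct DecoderStateSketch {
  int32_t last_token = 0;   // last emitted token (0 is blank in this toy)
  int32_t num_emitted = 0;  // stand-in for the decoder's recurrent states
};

// Toy replacement for "run decoder + joiner, take argmax".
using JoinFn = std::function<int32_t(float, const DecoderStateSketch &)>;

// Emits the frame value as a token unless it repeats the last token.
int32_t ToyJoiner(float frame, const DecoderStateSketch &state) {
  const int32_t candidate = static_cast<int32_t>(frame);
  return candidate == state.last_token ? 0 : candidate;
}

// Greedy search over one chunk with batch size 1. The state is updated
// only when a non-blank token is emitted, and it is kept by the caller
// so the next chunk can reuse it.
void DecodeChunk(const std::vector<float> &chunk, const JoinFn &join,
                 int32_t blank_id, int32_t max_symbols_per_frame,
                 DecoderStateSketch *state, std::vector<int32_t> *tokens) {
  assert(state != nullptr && tokens != nullptr);
  for (float frame : chunk) {
    for (int32_t n = 0; n != max_symbols_per_frame; ++n) {
      const int32_t y = join(frame, *state);
      if (y == blank_id) break;  // blank: advance to the next frame
      tokens->push_back(y);
      state->last_token = y;
      ++state->num_emitted;
    }
  }
}
```

Calling DecodeChunk on {1, 1, 2} and then on {2, 3} with the same state yields the tokens 1, 2, 3. If the state were reset between the two calls, the second chunk would emit 2 again, which is the kind of error that appears when decoder states are not carried across chunks.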
Hi @csukuangfj, thank you for the feedback. I have made the necessary changes as you described above. Could you please review them?
Thank You
By the way, please make sure the code compiles successfully on your computer.
Hi @csukuangfj,
I am unable to pinpoint and solve this compilation error. Could you please take a look?
[ 56%] Building CXX object sherpa-onnx/csrc/CMakeFiles/sherpa-onnx-core.dir/online-recognizer-impl.cc.o
In file included from /usr/include/c++/11/memory:76,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:8,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/usr/include/c++/11/bits/unique_ptr.h: In instantiation of ‘typename std::_MakeUniq<_Tp>::__single_object std::make_unique(_Args&& ...) [with _Tp = sherpa_onnx::OnlineTransducerModifiedBeamSearchDecoder; _Args = {sherpa_onnx::OnlineTransducerModel*, sherpa_onnx::OnlineLM*, int&, float&, int&, float&, float&}; typename std::_MakeUniq<_Tp>::__single_object = std::unique_ptr<sherpa_onnx::OnlineTransducerModifiedBeamSearchDecoder, std::default_delete<sherpa_onnx::OnlineTransducerModifiedBeamSearchDecoder> >]’:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-impl.h:109:77: required from here
/usr/include/c++/11/bits/unique_ptr.h:962:30: error: invalid new-expression of abstract class type ‘sherpa_onnx::OnlineTransducerModifiedBeamSearchDecoder’
962 | { return unique_ptr<_Tp>(new _Tp(std::forward<_Args>(__args)...)); }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-impl.h:30,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:9:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.h:18:7: note: because the following virtual functions are pure within ‘sherpa_onnx::OnlineTransducerModifiedBeamSearchDecoder’:
18 | class OnlineTransducerModifiedBeamSearchDecoder
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-stream.h:17,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer.h:22,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:13,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-decoder.h:85:35: note: ‘virtual std::vector<Ort::Value> sherpa_onnx::OnlineTransducerDecoder::Decode_me(Ort::Value, std::vector<Ort::Value>, std::vector<sherpa_onnx::OnlineTransducerDecoderResult>*, sherpa_onnx::OnlineStream**, int32_t)’
85 | virtual std::vector<Ort::Value> Decode_me(Ort::Value encoder_out,
| ^~~~~~~~~
In file included from /usr/include/c++/11/memory:76,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:8,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/usr/include/c++/11/bits/unique_ptr.h: In instantiation of ‘typename std::_MakeUniq<_Tp>::__single_object std::make_unique(_Args&& ...) [with _Tp = sherpa_onnx::OnlineTransducerGreedySearchDecoder; _Args = {sherpa_onnx::OnlineTransducerModel*, int&, float&, float&}; typename std::_MakeUniq<_Tp>::__single_object = std::unique_ptr<sherpa_onnx::OnlineTransducerGreedySearchDecoder, std::default_delete<sherpa_onnx::OnlineTransducerGreedySearchDecoder> >]’:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-impl.h:115:71: required from here
/usr/include/c++/11/bits/unique_ptr.h:962:30: error: invalid new-expression of abstract class type ‘sherpa_onnx::OnlineTransducerGreedySearchDecoder’
962 | { return unique_ptr<_Tp>(new _Tp(std::forward<_Args>(__args)...)); }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-impl.h:28,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:9:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-greedy-search-decoder.h:15:7: note: because the following virtual functions are pure within ‘sherpa_onnx::OnlineTransducerGreedySearchDecoder’:
15 | class OnlineTransducerGreedySearchDecoder : public OnlineTransducerDecoder {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-stream.h:17,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer.h:22,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:13,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-decoder.h:85:35: note: ‘virtual std::vector<Ort::Value> sherpa_onnx::OnlineTransducerDecoder::Decode_me(Ort::Value, std::vector<Ort::Value>, std::vector<sherpa_onnx::OnlineTransducerDecoderResult>*, sherpa_onnx::OnlineStream**, int32_t)’
85 | virtual std::vector<Ort::Value> Decode_me(Ort::Value encoder_out,
| ^~~~~~~~~
In file included from /usr/include/c++/11/memory:76,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:8,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/usr/include/c++/11/bits/unique_ptr.h: In instantiation of ‘typename std::_MakeUniq<_Tp>::__single_object std::make_unique(_Args&& ...) [with _Tp = sherpa_onnx::OnlineTransducerGreedySearchNeMoDecoder; _Args = {sherpa_onnx::OnlineTransducerNeMoModel*, float&}; typename std::_MakeUniq<_Tp>::__single_object = std::unique_ptr<sherpa_onnx::OnlineTransducerGreedySearchNeMoDecoder, std::default_delete<sherpa_onnx::OnlineTransducerGreedySearchNeMoDecoder> >]’:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h:53:75: required from here
/usr/include/c++/11/bits/unique_ptr.h:962:30: error: invalid new-expression of abstract class type ‘sherpa_onnx::OnlineTransducerGreedySearchNeMoDecoder’
962 | { return unique_ptr<_Tp>(new _Tp(std::forward<_Args>(__args)...)); }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h:26,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:10:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h:15:7: note: because the following virtual functions are pure within ‘sherpa_onnx::OnlineTransducerGreedySearchNeMoDecoder’:
15 | class OnlineTransducerGreedySearchNeMoDecoder : public OnlineTransducerDecoder {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-stream.h:17,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer.h:22,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.h:13,
from /mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-recognizer-impl.cc:5:
/mnt/local/sangeet/workncode/k2-fsa/fang/sherpa-onnx/sherpa-onnx/csrc/online-transducer-decoder.h:82:16: note: ‘virtual void sherpa_onnx::OnlineTransducerDecoder::Decode(Ort::Value, std::vector<sherpa_onnx::OnlineTransducerDecoderResult>*)’
82 | virtual void Decode(Ort::Value encoder_out,
| ^~~~~~
cc1plus: note: unrecognized command-line option ‘-Wno-missing-template-keyword’ may have been intended to silence earlier diagnostics
make[2]: *** [sherpa-onnx/csrc/CMakeFiles/sherpa-onnx-core.dir/build.make:832: sherpa-onnx/csrc/CMakeFiles/sherpa-onnx-core.dir/online-recognizer-impl.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1552: sherpa-onnx/csrc/CMakeFiles/sherpa-onnx-core.dir/all] Error 2
make: *** [Makefile:156: all] Error 2
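The log above reports that three decoder classes are abstract: the two existing ones do not override the pure virtual Decode_me, and OnlineTransducerGreedySearchNeMoDecoder does not override the pure virtual Decode. The sketch below reproduces the pattern with hypothetical class names and simplified signatures, and shows one generic way to avoid it: give a newly added virtual a default body instead of `= 0`, and make sure each subclass overrides every remaining pure virtual. It is an illustration of the error, not necessarily the fix applied in this PR.

```cpp
#include <cassert>
#include <memory>

class DecoderBaseSketch {
 public:
  virtual ~DecoderBaseSketch() = default;

  // Pure virtual: every concrete subclass must override this, or
  // std::make_unique<Subclass>() fails with "invalid new-expression
  // of abstract class type".
  virtual int Decode(int encoder_out) = 0;

  // Added later. A default body (rather than `= 0`) keeps the existing
  // subclasses instantiable without touching them.
  virtual int DecodeWithStates(int encoder_out, int state) {
    (void)state;  // unused in the default implementation
    return Decode(encoder_out);
  }
};

// Existing decoder: only overrides the original pure virtual.
class GreedyDecoderSketch : public DecoderBaseSketch {
 public:
  int Decode(int encoder_out) override { return encoder_out + 1; }
};

// New stateful decoder: overrides the new hook and still has to
// provide Decode, because that one is pure in the base class.
class NeMoDecoderSketch : public DecoderBaseSketch {
 public:
  int Decode(int encoder_out) override { return encoder_out; }
  int DecodeWithStates(int encoder_out, int state) override {
    return encoder_out + state;
  }
};
```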
I suggest that you copy & paste our C++ greedy search decoding code for non-streaming stateful NeMo transducer and then change the code to handle the states of the decoder model.
Almost everything you need is already there.
Hi @csukuangfj, I really appreciate all your help throughout. Could I please ask you to fix the greedy decoder implementation? I have been stuck for quite some time now and can't find a way through this. Thank you.
Sure, will push new commits to your branch this week.
Hi @csukuangfj, I made some minor changes. As of now, there are no errors and decoding works, but the predictions are correct only for the first few decoding streams; after that, the predictions become incorrect.
To give you an example..
CORRECT PREDICTION: after early nightfall the yellow lamps...
CURRENT PREDICTION:
I have the suspicion that something is wrong inside the greedy search decoder implementation.
You are almost there!
I am merging it and will take care of the rest.
Thank you for your contribution!