Fangjun Kuang
They both support real-time speech recognition on embedded devices. Please see

- https://github.com/k2-fsa/sherpa-onnx
- https://github.com/k2-fsa/sherpa-ncnn

Note: Please use our latest streaming zipformer for testing.
Please see https://livejohnshopkins-my.sharepoint.com/:p:/g/personal/mwiesne2_jh_edu/EYqRDl8cIr5BsVDxi1MOW5EBUpdqh10WFkzqixPIFM63hg?e=u3lrmL
The steps to do that are given below. ## Streaming models 1. Add a member after the following line https://github.com/k2-fsa/sherpa/blob/4254d4a302bc7bc2497900d7474dcc29bbc23b9f/sherpa/cpp_api/online-recognizer.h#L70 ```cpp // temperature for the softmax in the joiner float...
We plan to add CTC decoding support for streaming models with graph(s) in C++. As for the models, they need not necessarily come from [icefall](https://github.com/k2-fsa/icefall). As long as there is...
https://github.com/k2-fsa/sherpa-onnx/pull/147 added shallow fusion with RNNLM. We also need to support it in sherpa. The first step is to use https://github.com/k2-fsa/icefall/pull/1050 as a reference to export the model via torchscript....
We currently have

- https://github.com/k2-fsa/sherpa/blob/master/sherpa/csrc/endpoint.h
- https://github.com/k2-fsa/sherpa/blob/master/sherpa/csrc/endpoint.cc

but they are not used in online ASR. The steps to add it to online ASR are:

(1) Move endpoint.h from `csrc` to...
The following page lists all CTC models currently supported in sherpa: https://k2-fsa.github.io/sherpa/cpp/pretrained_models/offline_ctc.html Namely, models from icefall, wenet, and torchaudio (wav2vec 2.0) are supported. --- Basically, sherpa can support...
Currently, the WebSocket client sends samples in float32 format. We need to replace it with int16 to save bandwidth.
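For anyone picking this up, here is a minimal sketch of the conversion in C++. It is not taken from the sherpa code base: `FloatToInt16` and `Int16ToFloat` are hypothetical helper names, and it assumes the samples are normalized to [-1, 1] and that the server is changed at the same time to decode int16 back to float. Each sample shrinks from 4 bytes to 2, so the payload is halved.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical client-side helper (not part of sherpa).
// Assumes float samples are normalized to [-1, 1]; values outside
// that range are clamped before scaling to 16-bit PCM.
std::vector<int16_t> FloatToInt16(const std::vector<float> &samples) {
  std::vector<int16_t> out(samples.size());
  for (std::size_t i = 0; i != samples.size(); ++i) {
    float s = std::max(-1.0f, std::min(1.0f, samples[i]));
    long v = std::lround(s * 32768.0f);
    // +1.0 maps to 32768, which does not fit in int16_t, so cap it.
    v = std::min(32767L, std::max(-32768L, v));
    out[i] = static_cast<int16_t>(v);
  }
  return out;
}

// Hypothetical server-side helper (not part of sherpa).
// Converts received int16 samples back to floats in [-1, 1).
std::vector<float> Int16ToFloat(const std::vector<int16_t> &samples) {
  std::vector<float> out(samples.size());
  for (std::size_t i = 0; i != samples.size(); ++i) {
    out[i] = static_cast<float>(samples[i]) / 32768.0f;
  }
  return out;
}
```

The client would send `pcm.data()` with `pcm.size() * sizeof(int16_t)` bytes as the binary message. Both ends must agree on byte order; this sketch assumes client and server use the same endianness.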
Help from the community is appreciated. We can provide help to accomplish this task.