Fangjun Kuang issues

Results: 165 issues of Fangjun Kuang

Request to add sherpa-ncnn and sherpa-onnx

They both support real-time speech recognition on embedded devices. Please see

- https://github.com/k2-fsa/sherpa-onnx
- https://github.com/k2-fsa/sherpa-ncnn

Note: Please use our latest streaming zipformer for testing.

FYI: Slides for the Interspeech 2023 tutorial

Please see https://livejohnshopkins-my.sharepoint.com/:p:/g/personal/mwiesne2_jh_edu/EYqRDl8cIr5BsVDxi1MOW5EBUpdqh10WFkzqixPIFM63hg?e=u3lrmL

Add temperature to softmax

The steps to do that are given below.

## Streaming models

1. Add a member after the following line https://github.com/k2-fsa/sherpa/blob/4254d4a302bc7bc2497900d7474dcc29bbc23b9f/sherpa/cpp_api/online-recognizer.h#L70

```cpp
// temperature for the softmax in the joiner
float...
```
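For readers unfamiliar with the idea, a temperature-scaled softmax divides the logits by a temperature `T` before normalizing: `T > 1` flattens the distribution and `T < 1` sharpens it. The following is a minimal, standalone sketch of that computation only. `SoftmaxWithTemperature` is a hypothetical helper written for illustration; it is not part of the sherpa API, and how the temperature is wired into the joiner is not shown in this excerpt.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a sherpa function): softmax over logits / temperature.
std::vector<float> SoftmaxWithTemperature(const std::vector<float> &logits,
                                          float temperature) {
  // A non-positive temperature is not meaningful for this scaling.
  assert(temperature > 0.0f);

  std::vector<float> probs(logits.size());
  if (logits.empty()) return probs;

  // Subtract the maximum logit before exponentiating for numerical stability.
  float max_logit = *std::max_element(logits.begin(), logits.end());

  float sum = 0.0f;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    probs[i] = std::exp((logits[i] - max_logit) / temperature);
    sum += probs[i];
  }

  // Normalize so that the outputs sum to 1.
  for (float &p : probs) p /= sum;
  return probs;
}
```

With `temperature` set to 1 this reduces to the ordinary softmax, so a default of 1 would leave existing behavior unchanged under this sketch.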

[Feature proposal] Support CTC decoding with graph(s) for streaming models

We plan to add CTC decoding support for streaming models with graph(s) in C++. As for the models, they need not necessarily come from [icefall](https://github.com/k2-fsa/icefall). As long as there is...

Support shallow fusion with RNNLM

https://github.com/k2-fsa/sherpa-onnx/pull/147 added shallow fusion with RNNLM. We also need to support it in sherpa. The first step is to use https://github.com/k2-fsa/icefall/pull/1050 as a reference to export the model via TorchScript....

help wanted

Support automatic-mixed-precision (AMP)

ready

cpp

[help wanted] Add endpointing

We currently have

- https://github.com/k2-fsa/sherpa/blob/master/sherpa/csrc/endpoint.h
- https://github.com/k2-fsa/sherpa/blob/master/sherpa/csrc/endpoint.cc

but they are not used in online ASR. The steps to add it to online ASR are: (1) Move endpoint.h from `csrc` to...

help wanted

[help wanted] Support more models trained with CTC loss

The following page lists all CTC models currently supported in sherpa: https://k2-fsa.github.io/sherpa/cpp/pretrained_models/offline_ctc.html Namely, models from icefall, wenet, and torchaudio (wav2vec 2.0) are supported.

---

Basically, sherpa can support...

help wanted

[help wanted] Send int16 samples

Currently, the WebSocket client sends samples in float32 format. We need to replace it with int16 to save bandwidth.
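A float32 sample takes 4 bytes and an int16 sample takes 2, so the change would halve the audio payload. The conversion itself could look like the sketch below. It assumes the float samples are normalized to the range [-1, 1], which this excerpt does not state; `FloatToInt16` and `Int16ToFloat` are hypothetical helper names, not existing sherpa functions, and the WebSocket send and receive calls are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical client-side helper: quantize float samples to int16.
// Assumption: input samples are normalized to [-1, 1].
std::vector<std::int16_t> FloatToInt16(const std::vector<float> &samples) {
  std::vector<std::int16_t> out(samples.size());
  for (std::size_t i = 0; i < samples.size(); ++i) {
    // NaN has no int16 representation, so treat it as a caller error.
    assert(!std::isnan(samples[i]));

    // Clip out-of-range samples before scaling to avoid integer overflow.
    float clipped = std::max(-1.0f, std::min(1.0f, samples[i]));
    long v = std::lround(clipped * 32768.0f);

    // +1.0 scales to 32768, which does not fit in int16; clamp it to 32767.
    v = std::max(-32768L, std::min(32767L, v));
    out[i] = static_cast<std::int16_t>(v);
  }
  return out;
}

// Hypothetical server-side helper: map int16 samples back to floats in [-1, 1).
std::vector<float> Int16ToFloat(const std::vector<std::int16_t> &samples) {
  std::vector<float> out(samples.size());
  for (std::size_t i = 0; i < samples.size(); ++i) {
    out[i] = static_cast<float>(samples[i]) / 32768.0f;
  }
  return out;
}
```

The server would need the matching inverse conversion, and both ends have to agree on byte order when the int16 buffer is sent as a binary message.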

help wanted

[help wanted] Add documentation describing how to support a new CTC model in sherpa

Help from the community is appreciated. We can provide help to accomplish this task.