
WIP: Add shallow fusion to C API

Open csukuangfj opened this issue 2 years ago • 10 comments

Integrate changes from https://github.com/k2-fsa/sherpa-onnx/pull/147

TODOs

  • [ ] Fix iOS demo
  • [ ] Fix .Net demo
  • [ ] Fix Python APIs

csukuangfj avatar May 11 '23 04:05 csukuangfj

Hi, I'm using OnlineLMConfig via the Python APIs, but it seems that setting `model` to the RNN-LM ONNX path has no effect. May I ask how to use shallow fusion from the Python APIs?

kamirdin avatar Oct 07 '23 06:10 kamirdin

but it seems that setting `model` to the RNN-LM ONNX path has no effect

How do you tell it does not work?

csukuangfj avatar Oct 07 '23 06:10 csukuangfj

How do you tell it does not work?

By decoding a test set of 1k samples (roughly 10k characters), I got exactly the same result as without the LM.

kamirdin avatar Oct 07 '23 06:10 kamirdin

Before this, I obtained better results using the Python LM decoding scripts from Icefall, with the same test set, ASR model, and LM. So I expected better results here too, or at least a few characters to change.

kamirdin avatar Oct 07 '23 06:10 kamirdin

By decoding a test set which has 1k samples, say 10k characters, but got exactly the same result compared to not using LM.

How many LM scales have you tried?

csukuangfj avatar Oct 07 '23 06:10 csukuangfj

Python code:

lm_config = OnlineLMConfig(
    model=lm,
    scale=scale,
)

print(lm_config)
print("="*30)

recognizer_config = OnlineRecognizerConfig(
    feat_config=feat_config,
    model_config=model_config,
    lm_config=lm_config,
    endpoint_config=endpoint_config,
    enable_endpoint=enable_endpoint_detection,
    decoding_method=decoding_method,
    max_active_paths=max_active_paths,
    context_score=context_score,
)
print(recognizer_config)

and then it prints this:

OnlineLMConfig(model="base/with-state-epoch-21-avg-2.onnx", scale=1.1)
==============================
OnlineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OnlineTransducerModelConfig(encoder_filename="./asr_model_chunk320/encoder.onnx", decoder_filename="./asr_model_chunk320/decoder.onnx", joiner_filename="./asr_model_chunk320/joiner.onnx", tokens="./asr_model_chunk320/tokens.txt", num_threads=8, provider="cpu", model_type="", debug=False), lm_config=OnlineLMConfig(model="", scale=0.5), endpoint_config=EndpointConfig(rule1=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=2.4, min_utterance_length=0), rule2=EndpointRule(must_contain_nonsilence=True, min_trailing_silence=1.2, min_utterance_length=0), rule3=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=0, min_utterance_length=20)), enable_endpoint=False, max_active_paths=4, context_score=1.5, decoding_method="modified_beam_search")

I added `.def_readwrite("lm_config", &PyClass::lm_config)` here: https://github.com/k2-fsa/sherpa-onnx/blob/36017d49c4f0b2f2f87feeeb0a40e54be4487b76/sherpa-onnx/python/csrc/online-recognizer.cc#L39 and rebuilt, but recognizer_config still prints lm_config=OnlineLMConfig(model="", scale=0.5).

kamirdin avatar Oct 07 '23 07:10 kamirdin

By adding `lm_config(lm_config),` at https://github.com/k2-fsa/sherpa-onnx/blob/36017d49c4f0b2f2f87feeeb0a40e54be4487b76/sherpa-onnx/csrc/online-recognizer.h#L97C16-L97C16 it appears to work now, but it produces more deletion errors than the Python script, so it seems there is still some work to be done. In any case, thank you for your assistance!

kamirdin avatar Oct 07 '23 08:10 kamirdin

https://github.com/k2-fsa/sherpa-onnx/blob/36017d49c4f0b2f2f87feeeb0a40e54be4487b76/sherpa-onnx/csrc/online-recognizer.h#L97C16-L97C16

Thank you for identifying the bug. Would you mind creating a PR to fix it?

csukuangfj avatar Oct 07 '23 08:10 csukuangfj

Sure, I will create a PR after more testing is finished.

kamirdin avatar Oct 07 '23 08:10 kamirdin

I'd like to use the online LM with the C API. What's the status on this?

rkjaran avatar Dec 21 '23 10:12 rkjaran