
Re-implement LM rescore for online transducer

Open SilverSulfide opened this issue 1 year ago • 0 comments

Shallow fusion can be too slow for online CPU inference, so this change adds an option to use classical LM rescoring instead.

  • Rescore implementation based on https://github.com/k2-fsa/sherpa-onnx/pull/133
  • Shallow fusion enabled by default
  • Pass lm-shallow-fusion=false to enable rescore instead
  • Updated online Python API
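To illustrate the general idea behind classical rescoring, here is a minimal toy sketch. It is not the sherpa-onnx implementation: `Hypothesis`, `toy_lm_log_prob`, and `rescore` are hypothetical names, and the "LM" is a stand-in that merely penalizes repeated tokens. The point is that the LM is queried once per finished hypothesis, whereas shallow fusion queries it at every decoding step for every beam expansion, which is where the extra cost comes from.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    tokens: List[str]
    am_log_prob: float  # score from the transducer (acoustic side) only


def toy_lm_log_prob(tokens: List[str]) -> float:
    """Hypothetical stand-in for a neural LM: -0.5 per token,
    plus an extra -2.0 for every immediately repeated token."""
    score = -0.5 * len(tokens)
    for prev, cur in zip(tokens, tokens[1:]):
        if prev == cur:
            score -= 2.0
    return score


def rescore(hyps: List[Hypothesis], lm_scale: float) -> Hypothesis:
    """Re-rank an n-best list: one LM call per hypothesis,
    combined score = am_log_prob + lm_scale * lm_log_prob."""
    return max(
        hyps,
        key=lambda h: h.am_log_prob + lm_scale * toy_lm_log_prob(h.tokens),
    )


if __name__ == "__main__":
    n_best = [
        Hypothesis(["a", "a"], am_log_prob=-1.0),
        Hypothesis(["a", "b"], am_log_prob=-1.2),
    ]
    print(rescore(n_best, lm_scale=0.0).tokens)  # acoustic score alone decides
    print(rescore(n_best, lm_scale=0.5).tokens)  # LM changes the ranking
```

With `lm_scale=0.0` the acoustically best hypothesis wins; with a non-zero scale the LM can promote a different one without ever being consulted during the search itself.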

CPU runtime comparison for a ~1 min WAV file, using the default --lm-num-threads=1:

| Method         | Runtime (s) | RTF   |
|----------------|-------------|-------|
| No LM          | 5.7         | 0.098 |
| Rescore        | 40          | 0.68  |
| Shallow Fusion | 69          | 1.2   |
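As a quick sanity check on these numbers: RTF (real-time factor) is processing time divided by audio duration, so each row implies an audio length of runtime / RTF. The short sketch below uses only the figures reported above; the helper names are made up for illustration.

```python
def real_time_factor(runtime_s: float, audio_duration_s: float) -> float:
    """RTF = processing time / audio duration (below 1.0 is faster than real time)."""
    return runtime_s / audio_duration_s


def implied_audio_duration(runtime_s: float, rtf: float) -> float:
    """Invert the RTF definition to recover the audio length."""
    return runtime_s / rtf


# (runtime in seconds, RTF) as reported in the comparison
results = {
    "No LM": (5.7, 0.098),
    "Rescore": (40.0, 0.68),
    "Shallow Fusion": (69.0, 1.2),
}

if __name__ == "__main__":
    for method, (runtime_s, rtf) in results.items():
        duration = implied_audio_duration(runtime_s, rtf)
        print(f"{method}: implied audio length {duration:.1f} s")

    speedup = results["Shallow Fusion"][0] / results["Rescore"][0]
    print(f"Rescore vs. shallow fusion speedup: {speedup:.2f}x")
```

All three rows imply roughly the same audio length (a little under a minute), which matches the "~1 min" description, and only shallow fusion ends up slower than real time.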

SilverSulfide, Aug 07 '24 10:08