Re-implement LM rescore for online transducer
Shallow fusion can be too slow for online CPU inference. This PR adds an option to use classical LM rescoring instead.
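For context, a minimal conceptual sketch of the difference (not the code in this PR): classical rescoring calls the external LM once per finished hypothesis and re-ranks the n-best list, whereas shallow fusion queries the LM at every decoding step, which is why rescoring is cheaper on CPU. The function below is illustrative only; names and the `lm_scale` value are assumptions.

```python
from typing import Callable, List, Tuple

def rescore_nbest(
    hyps: List[Tuple[List[int], float]],     # (token_ids, transducer/acoustic score)
    lm_score: Callable[[List[int]], float],  # LM log-probability of a token sequence
    lm_scale: float = 0.5,                   # illustrative weight; tune per model
) -> List[int]:
    """Return the hypothesis with the best combined score."""
    best_tokens, best_score = [], float("-inf")
    for tokens, am_score in hyps:
        # One LM evaluation per hypothesis, after beam search has finished.
        total = am_score + lm_scale * lm_score(tokens)
        if total > best_score:
            best_tokens, best_score = tokens, total
    return best_tokens
```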
- Rescore implementation based on https://github.com/k2-fsa/sherpa-onnx/pull/133
- Shallow fusion remains enabled by default
- Pass --lm-shallow-fusion=false to use rescoring instead
- Updated the online Python API accordingly (see the sketch after this list)
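A hedged usage sketch of the updated online Python API. The exact constructor and keyword names may differ by sherpa-onnx version; `lm`, `lm_scale`, and `lm_shallow_fusion` below simply mirror the CLI options and are assumptions, as are the model file paths.

```python
import sherpa_onnx

# Keyword names for the LM options are illustrative; they mirror the CLI flags.
recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(
    tokens="tokens.txt",
    encoder="encoder.onnx",
    decoder="decoder.onnx",
    joiner="joiner.onnx",
    decoding_method="modified_beam_search",  # LM is typically used with beam search
    lm="lm.onnx",                 # external language model
    lm_scale=0.5,                 # weight of the LM score
    lm_shallow_fusion=False,      # False selects the classical rescore path
)

stream = recognizer.create_stream()
# Feed 16 kHz float32 samples incrementally, e.g.:
# stream.accept_waveform(16000, samples)
while recognizer.is_ready(stream):
    recognizer.decode_stream(stream)
print(recognizer.get_result(stream))
```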
CPU runtime comparison for a ~1 min wav file using the default --lm-num-threads=1:
| Method | Runtime (s) | RTF (real-time factor) |
|---|---|---|
| No LM | 5.7 | 0.098 |
| Rescore | 40 | 0.68 |
| Shallow Fusion | 69 | 1.2 |