sherpa-onnx
Adding new words using Hotwords option with multiple variants per word.
I have encountered an issue with the word PERCENT not being recognized. I generated the BPE tokens for this word using the tool sherpa-onnx/scripts/text2token.py, which produced only one variant: ▁PER C ENT. However, with this hotword the word is still not recognized in most cases.
Is there any way to produce more token variants per word to improve recognition, or should I instead adjust parameters for this word, such as the boosting score or trigger threshold, or increase hotwords_score?
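
To make "multiple variants per word" concrete, here is a minimal sketch of what I have in mind: list every way the word can be split into tokens from the vocabulary and write each split as its own hotword line. The vocabulary below is a made-up toy set, not the real tokens.txt of any model, and `segmentations` is a hypothetical helper of mine, not part of sherpa-onnx or text2token.py. Whether extra variants actually help recognition is exactly what I am unsure about.

```python
from typing import List, Set


def segmentations(word: str, vocab: Set[str]) -> List[List[str]]:
    """Return every split of `word` into pieces that all exist in `vocab`."""
    marked = "▁" + word  # SentencePiece-style word-start marker (U+2581)
    results: List[List[str]] = []

    def walk(start: int, pieces: List[str]) -> None:
        # Reached the end of the word: the pieces collected so far are a full split.
        if start == len(marked):
            results.append(pieces)
            return
        # Try every vocabulary token that matches at the current position.
        for end in range(start + 1, len(marked) + 1):
            piece = marked[start:end]
            if piece in vocab:
                walk(end, pieces + [piece])

    walk(0, [])
    return results


# Hypothetical toy vocabulary; a real one would be read from the model's tokens.txt.
toy_vocab = {"▁PER", "▁P", "ER", "C", "ENT", "CENT"}

variants = segmentations("PERCENT", toy_vocab)
# One space-separated token sequence per line, as in a hotwords file.
hotword_lines = [" ".join(v) for v in variants]

for line in hotword_lines:
    print(line)
```

With this toy vocabulary the word has four splits, including the ▁PER C ENT variant that text2token.py gave me. The idea would be to put all of them in the hotwords file, one per line.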