How to increase the accuracy of the vosk module in Python?

Open amina1403 opened this issue 1 year ago • 8 comments

How can I maximize the accuracy of the vosk module? I am using the "large" model. Thanks.

Code:

    import wave
    from vosk import Model, KaldiRecognizer

    model = Model("vosk-model-fa-0.5")
    wf = wave.open("audio.wav", "rb")

    rec = KaldiRecognizer(model, wf.getframerate())

    # Feed the audio to the recognizer in chunks of 8000 frames
    while True:
        data = wf.readframes(8000)
        if len(data) == 0:
            break
        if rec.AcceptWaveform(data):
            a = rec.Result()
        else:
            a = rec.PartialResult()

    a = rec.Result()

amina1403 • May 26 '23 13:05
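
Before tuning anything, it may be worth ruling out an input-format problem, since the snippet above passes whatever sample rate the file has straight to KaldiRecognizer. The following is a minimal sketch using only the standard-library wave module. The helper name check_wav and the 16000 Hz default are assumptions for illustration; the rate a given model expects depends on that model.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return a list of format problems found in a WAV file (hypothetical helper)."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append("not mono")
        if wf.getsampwidth() != 2:
            problems.append("not 16-bit")
        if wf.getcomptype() != "NONE":
            problems.append("compressed")
        rate = wf.getframerate()
        if rate != expected_rate:
            # expected_rate is an assumption; check what the model was built for
            problems.append("sample rate %d, expected %d" % (rate, expected_rate))
    return problems
```

An empty list means the file is mono, 16-bit, uncompressed PCM at the expected rate; anything else would probably need converting before recognition results can be compared fairly.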

You can provide us with 1000 hours of audio data so that we can build a more accurate model.

nshmyrev • May 26 '23 21:05

You can provide us with 1000 hours of audio data so that we can build a more accurate model.

1000 hours! That would be difficult for me, but not impossible, because there are so many free audiobooks and podcasts. I read that accuracy increases as the LSTM increases; is that true, and if so, how? What changes can be made to the code to increase accuracy?

What changes would you recommend in general that would increase accuracy by at least 10%?

amina1403 • May 27 '23 09:05

We do not need audiobooks; we need real-life data. What is the application you want to build? What particular audio are you going to recognize?

nshmyrev • May 27 '23 14:05

We do not need audiobooks; we need real-life data. What is the application you want to build? What particular audio are you going to recognize?

Software that transcribes an audio file and delivers the text. I have already built this, but competitors in this field are 10 to 20 percent more accurate.

amina1403 • May 27 '23 17:05

You can share the data with us to catch up.

nshmyrev • May 29 '23 12:05

Does changing the following values have an impact on the accuracy of the vosk module?

model.conf:
--min-active=200
--max-active=3000
--beam=10.0
--lattice-beam=2.0
--acoustic-scale=1.0
--frame-subsampling-factor=3
--endpoint.silence-phones=1:2:3:4:5:6:7:8:9:10
--endpoint.rule2.min-trailing-silence=0.5
--endpoint.rule3.min-trailing-silence=1.0
--endpoint.rule4.min-trailing-silence=2.0

mfcc.conf:
--use-energy=false
--num-mel-bins=20
--num-ceps=20
--low-freq=20
--high-freq=7600

amina1403 • Jun 01 '23 16:06

Does changing the following values have an impact on the accuracy of the vosk module?

Yes

nshmyrev • Jun 01 '23 21:06
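
As a rough illustration of how such values could be tried out, the sketch below parses --name=value lines like those in model.conf and applies overrides. In Kaldi-style decoders, larger --beam, --lattice-beam, and --max-active values generally widen the search at the cost of speed. The mfcc.conf values normally have to match the features the acoustic model was trained on, so they are left alone here. The helper names and the override values are assumptions for illustration, not recommended settings.

```python
def parse_conf(text):
    """Parse '--name=value' lines into a dict of strings, keeping file order."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("--") or "=" not in line:
            continue  # skip blank lines and anything that is not an option
        name, value = line[2:].split("=", 1)
        options[name] = value
    return options


def render_conf(options):
    """Turn the dict back into '--name=value' lines."""
    return "\n".join("--%s=%s" % (k, v) for k, v in options.items()) + "\n"


def override_conf(text, overrides):
    """Return the config text with the given options replaced or added."""
    options = parse_conf(text)
    options.update(overrides)
    return render_conf(options)


original = """--min-active=200
--max-active=3000
--beam=10.0
--lattice-beam=2.0
"""

# Hypothetical values chosen only to show the mechanics.
print(override_conf(original, {"beam": "13.0", "max-active": "7000"}))
```

The output could be written to model.conf in a copy of the model directory, and the same test recordings transcribed with both copies to see whether the change helps.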

Does changing the following values have an impact on the accuracy of the vosk module?

Yes

Which one should I change to increase the accuracy of the module?

amina1403 • Jun 02 '23 11:06
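
Which value helps is an empirical question that the thread leaves unanswered. One way to approach it is to transcribe the same test recordings under each setting and compare the word error rate (WER) against reference transcripts. The following is a minimal sketch of WER computed as word-level edit distance; the function name and the sample strings are illustrative assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings, not real recognizer output.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Running this over a fixed set of recordings before and after each config change gives a number to compare, rather than judging by ear.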