
The accuracy issue of real-time Speech-to-Text (STT) transcription

Open jacobtang opened this issue 1 year ago • 6 comments

The text passed to the callback in recorder.text(process_text) often contains repeated content, or it arrives late and accumulates. Are there recommended reference values for the recorder_config parameters? Thanks. The recorder is created with recorder = AudioToTextRecorder(**recorder_config).

jacobtang avatar Feb 24 '24 17:02 jacobtang

This sounds more like the behaviour of the on_realtime_transcription_update callback. It definitely should not occur with the default parameter set. My first guess is that you may be using the same callback for both on_transcription_finished (passed to the text method) and on_realtime_transcription_update (passed to the AudioToTextRecorder constructor).
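A minimal sketch of keeping the two callbacks separate, so that partial real-time updates never get mixed into the final text. The function and list names (on_realtime_update, on_final_text, run_recorder, realtime_updates, final_sentences) are made up for illustration; only AudioToTextRecorder, recorder.text, enable_realtime_transcription and on_realtime_transcription_update are taken from RealtimeSTT, and the config values are not tuned recommendations.

```python
# Hypothetical storage, used here only to show that the two streams stay apart.
realtime_updates = []   # partial text, revised repeatedly while you speak
final_sentences = []    # one entry per finished utterance


def on_realtime_update(text):
    # Called often with the *current best guess* for the ongoing utterance.
    # Each call supersedes the previous one, so appending these to a
    # transcript would produce repeated content.
    realtime_updates.append(text)


def on_final_text(text):
    # Called once per utterance with the finished transcription.
    final_sentences.append(text)


# Assumed config: only the real-time callback goes into the constructor.
recorder_config = {
    "enable_realtime_transcription": True,
    "on_realtime_transcription_update": on_realtime_update,
}


def run_recorder():
    # Imported lazily so the callbacks above can be tested without a microphone.
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(**recorder_config)
    while True:
        # The final-text callback is passed to text(), not to the constructor.
        recorder.text(on_final_text)
```

Calling run_recorder() starts the loop. If the same function were passed in both places, every partial update would be treated as a finished sentence, which would look like the repetition described above.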

KoljaB avatar Feb 24 '24 17:02 KoljaB

Some updates on this. A former faster-whisper version probably caused this (it somehow got corrupted on PyPI); I think it was 0.6.0. Newer versions are fine.

KoljaB avatar Apr 13 '24 11:04 KoljaB

Thanks! Should I use version 0.6.0 of faster-whisper instead of the latest [v1.0.1](https://github.com/SYSTRAN/faster-whisper/releases/tag/v1.0.1)? Or should I just update to the latest faster-whisper / RealtimeSTT version?

jacobtang avatar Apr 14 '24 13:04 jacobtang

You can upgrade RealtimeSTT to the newest version, which uses the latest faster-whisper 1.0.1 (this version is also referenced in the requirements file of RealtimeSTT).

KoljaB avatar Apr 14 '24 14:04 KoljaB

Great! Another question: the latest RealtimeSTT, v0.1.15, has a beam_size parameter. Can it be used to reduce the delay?

jacobtang avatar Apr 16 '24 14:04 jacobtang

It's a trade-off between accuracy and speed: a larger beam_size yields better-quality output because the model can explore more options and potentially avoid local minima in the search space. But it also means slower performance, because more sequences are evaluated at each step. So a smaller beam_size can reduce the delay, at some cost in accuracy.
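To make the trade-off concrete, here is a toy beam search over a two-token vocabulary. This is not faster-whisper's decoder; TOY_MODEL and its probabilities are invented so that the greedy choice (beam_size=1) gets stuck on a worse sequence, while beam_size=2 finds a better one by evaluating more candidates.

```python
import math

# Hypothetical toy model: probability of the next token given the previous one
# (None means "start of sequence"). The numbers are made up for illustration.
TOY_MODEL = {
    None: {"A": 0.6, "B": 0.4},
    "A": {"A": 0.55, "B": 0.45},
    "B": {"A": 0.9, "B": 0.1},
}


def beam_search(beam_size, length=2):
    """Return (best sequence, its probability, number of candidates scored)."""
    beams = [((), 0.0)]  # (tokens so far, cumulative log-probability)
    evaluated = 0
    for _ in range(length):
        candidates = []
        for tokens, score in beams:
            prev = tokens[-1] if tokens else None
            for token, prob in TOY_MODEL[prev].items():
                candidates.append((tokens + (token,), score + math.log(prob)))
                evaluated += 1
        # Keep only the beam_size highest-scoring hypotheses for the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    best_tokens, best_score = beams[0]
    return "".join(best_tokens), math.exp(best_score), evaluated


for size in (1, 2):
    sequence, probability, work = beam_search(size)
    print(f"beam_size={size}: best={sequence} p={probability:.2f} scored={work}")
```

With beam_size=1 the search commits to "A" first because it looks best locally, and ends at "AA" (probability 0.33) after scoring 4 candidates. With beam_size=2 it also keeps "B" alive and finds "BA" (probability 0.36), but scores 6 candidates to get there. In a real model the extra work grows with every decoding step, which is where the added latency comes from.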

KoljaB avatar Apr 16 '24 15:04 KoljaB