whisper.cpp
Zero-pad the audio, not the spectrogram
Makes sense. Hopefully this will reduce hallucinations.
https://github.com/openai/whisper/discussions/838#discussioncomment-5222689
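For anyone following along, here is a rough sketch of what "zero-pad the audio" means in practice: the raw samples are extended with zeros up to the full window before any featurization runs. It assumes 16 kHz mono input and a 30-second window; `pad_audio` is a hypothetical helper for illustration, not a function from Whisper or whisper.cpp.

```python
SAMPLE_RATE = 16000    # assumed: 16 kHz mono input
CHUNK_SECONDS = 30     # assumed: 30-second context window
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def pad_audio(samples, n_samples=N_SAMPLES):
    """Hypothetical helper: trim or zero-pad raw samples to exactly n_samples."""
    samples = list(samples[:n_samples])
    # Pad in the time domain, so the spectrogram is computed over real silence.
    return samples + [0.0] * (n_samples - len(samples))


padded = pad_audio([0.1, -0.2, 0.05])
print(len(padded), padded[:4])
```

The spectrogram is then computed from `padded`, so the padded frames go through exactly the same featurization as the speech frames.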
(Removed previous incorrect assumption about the featurization).
As long as you don't change the featurization, it looks like you can just switch your padding to -1.5
if it's more convenient to keep padding at the spectrogram level.
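A quick sketch of where the -1.5 comes from, assuming the featurization clamps mel power at 1e-10, takes log10, and then rescales with (x + 4) / 4, which is how I read the reference implementation. `normalize_log_mel` and its `global_max_log` parameter are names made up for this illustration.

```python
import math


def normalize_log_mel(power, global_max_log=None):
    """Hypothetical helper: map one mel power value to its normalized log-mel value."""
    log_val = math.log10(max(power, 1e-10))  # a silent bin clamps to log10(1e-10) = -10
    if global_max_log is not None:
        # Assumed dynamic-range clamp: nothing falls more than 8 below the loudest bin.
        log_val = max(log_val, global_max_log - 8.0)
    return (log_val + 4.0) / 4.0


# A silent (zero-power) bin with no dynamic-range clamp applied.
print(normalize_log_mel(0.0))
```

Note that if the dynamic-range clamp is active and the loudest bin is above -2 in log10 terms, the floor for silent bins ends up higher than -1.5, so a constant pad value only matches zero-padded audio when that clamp does not kick in.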
It seems like your speed_up could interfere with the padding otherwise
Oh, this might be a game-changer. I haven't been able to find a way to reduce the hallucinations so far.
This was merged into Whisper as of release 20230307. Is there a chance we'll see it in whisper.cpp soon?