faster-whisper

End of sequence token problem.

Open • ngcheeyuan opened this issue 1 year ago • 1 comment

I was transcribing an audio file that is about 65 seconds long. However, the model kept generating text until about 83 s, according to the segment timestamps.

Is this caused by the 30 s chunking, with the final 5 seconds of audio being padded to fill a full 30 s window?

Is there a way to solve this, other than post-processing the output to cut any text that exceeds the duration of the audio file?
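The post-processing workaround mentioned above can be sketched roughly as follows. This is only a minimal sketch, not part of faster-whisper: `Segment` and `clip_segments` are hypothetical names, and it assumes each segment carries `start` and `end` times in seconds plus its `text`. It drops segments that begin after the audio ends and clamps the end time of a segment that straddles the end; it does not try to trim the text of a straddling segment.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def clip_segments(segments: Iterable[Segment], duration: float) -> List[Segment]:
    """Drop segments starting at or after `duration`; clamp ends that overrun it."""
    kept = []
    for seg in segments:
        if seg.start >= duration:
            continue  # starts past the end of the audio, so discard it
        if seg.end > duration:
            seg = replace(seg, end=duration)  # straddles the end: clamp the timestamp
        kept.append(seg)
    return kept


# Made-up example data for a 65-second file.
segments = [
    Segment(0.0, 28.4, "first chunk"),
    Segment(58.0, 66.5, "straddles the end"),
    Segment(70.0, 83.0, "past the end"),
]
clipped = clip_segments(segments, duration=65.0)
for seg in clipped:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```

In faster-whisper, `WhisperModel.transcribe` returns the segments together with an info object whose `duration` field could supply the cutoff; the library's own segment objects would need converting to this stand-in type first.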

Thanks.

ngcheeyuan, Mar 01 '24 11:03

@ngcheeyuan, hello. Can you attach your audio file and share the transcription log?

trungkienbkhn, Mar 05 '24 07:03