faster-whisper
Shorter segments?
Would it be possible to produce shorter segments? (some are way too long)
There is no option that can effectively prevent this. The parameter length_penalty can help to some extent, but it will not force the model to predict a shorter segment.
Do you get a different output with openai/whisper? If yes, it would be great if you can provide a way to reproduce the output.
There have been discussions in openai/whisper about skewing the model toward shorter segments by tweaking max_text_token_logprob: https://github.com/openai/whisper/discussions/435#discussioncomment-4010615
Is something similar possible with the faster-whisper codebase?
I just saw the addition of length_penalty today. How should it be used? Its default value is set to 1.
@guillaumekln From my testing, I've also had great results using the token_timestamps flag here
Tbh, I don't know what CTranslate2 does to the underlying model, and if such capabilities are lost because the model was transformed.
At this time, we have not implemented any features or parameters that are not available in the reference implementation from openai/whisper. So there are currently no easy ways for users to tweak max_text_token_logprob or enable token-level timestamps, which would require changes to the C++ implementation in CTranslate2.
Regarding word-level timestamps, I'm following this development in the openai/whisper repo. If it is merged, I will look to support it here as well.
Also, you can ignore my comment regarding length_penalty. It is not relevant to your issue since you want the model to output more timestamps and not make the generated sequences shorter.
I just merged the word-level timestamps branch so the segments can now be as short as you want.
Hi @guillaumekln, do you mind explaining what you mean by "I just merged the word-level timestamps branch so the segments can now be as short as you want."?
How do we control their length now?
And why, a couple of months after this reply, did you say in https://github.com/SYSTRAN/faster-whisper/issues/452#issuecomment-1704859269 that "There is no option to control the segment length."?
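One plausible way to reconcile the two statements: the library has no option for segment length, but with word_timestamps=True each segment carries per-word start and end times, so shorter segments can be rebuilt in user code. Below is a minimal sketch of that idea, not something the maintainers posted. The names regroup_words, ShortSegment, transcribe_short, max_duration and max_chars are hypothetical, the Word class is only a stand-in for the word objects faster-whisper returns, and the sketch assumes each word string carries its own leading space.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    # Stand-in for faster-whisper's word objects; only these fields are used.
    start: float
    end: float
    word: str  # assumed to include its leading space, e.g. " Hello"


@dataclass
class ShortSegment:
    start: float
    end: float
    text: str


def _flush(current: List[Word]) -> ShortSegment:
    # Build one segment spanning the first to the last word in the group.
    text = "".join(w.word for w in current).strip()
    return ShortSegment(current[0].start, current[-1].end, text)


def regroup_words(
    words: Iterable[Word], max_duration: float = 3.0, max_chars: int = 42
) -> List[ShortSegment]:
    """Greedily pack consecutive words into segments that respect both limits.

    A single word that exceeds a limit on its own still gets its own segment.
    """
    segments: List[ShortSegment] = []
    current: List[Word] = []
    for w in words:
        if current:
            too_long = w.end - current[0].start > max_duration
            candidate = "".join(x.word for x in current + [w]).strip()
            if too_long or len(candidate) > max_chars:
                segments.append(_flush(current))
                current = []
        current.append(w)
    if current:
        segments.append(_flush(current))
    return segments


def transcribe_short(audio_path: str, model_size: str = "small", **limits):
    # Imported lazily so regroup_words can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size)
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    # Flatten the per-segment word lists, then regroup them under our own limits.
    words = [w for seg in segments for w in (seg.words or [])]
    return regroup_words(words, **limits)
```

For example, transcribe_short("audio.wav", max_duration=2.0) would return segments of roughly two seconds or less. The limits are enforced after decoding, so this changes how the output is split, not what the model predicts.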