CTranslate2 Whisper - Correct way to get prediction probability of each token and timestamp alignment

Open huydang2106 opened this issue 7 months ago • 0 comments

I have spent some time looking through the documentation but could not find the proper way to get the prediction probabilities of all tokens. Also, how can I get the token-level timestamp alignment from the model when the only input is the audio features? I did see the align function, but it requires both the input features and the input text tokens, which does not seem to meet my need.
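For reference, one possible workaround is to chain the two calls: generate the tokens from the audio features first, then pass those generated tokens to the align function. The sketch below covers only the post-processing step and does not call CTranslate2 itself. It assumes the alignment comes back as a monotonic list of (text token index, frame index) pairs and that each frame spans 0.02 s (Whisper's encoder emits 1500 frames per 30 s window). The name `token_start_times` is hypothetical, not part of any library.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: one encoder frame covers 20 ms (1500 frames per 30 s window).
SECONDS_PER_FRAME = 0.02


def token_start_times(
    alignments: Sequence[Tuple[int, int]],
    seconds_per_frame: float = SECONDS_PER_FRAME,
) -> List[float]:
    """Return the start time in seconds of each text token.

    `alignments` is assumed to be a monotonic path of
    (text_token_index, frame_index) pairs. A token starts at the
    first frame where its index appears in the path.
    """
    starts: List[float] = []
    previous_token: Optional[int] = None
    for token_index, frame_index in alignments:
        if token_index != previous_token:
            # First pair for this token: record its frame as the start time.
            starts.append(frame_index * seconds_per_frame)
            previous_token = token_index
    return starts


if __name__ == "__main__":
    # Hand-written toy path for three tokens, not real model output.
    path = [(0, 0), (0, 1), (1, 2), (1, 3), (2, 10)]
    print(token_start_times(path))
```

An end time for each token can be taken as the start time of the next token. Whether the per-token probabilities returned alongside the alignment match the probabilities used during decoding is something I have not verified.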

Dec 06 '23 10:12 huydang2106
