Max Bain
I was recently transcribing a several-minute YouTube video when I noticed that the output transcription of _whisper small.en_ looked something like this:

[02:01.980 --> 02:04.060] We are GUYS!
[02:04.300...
Currently, Arabic numerals and symbols in the Whisper transcript cannot be aligned; they need to be written out as words first. We need to perform the inverse of the normalization in https://github.com/m-bain/whisperX/blob/main/whisperx/normalizers/english.py such that numbers and currencies are...
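As a rough illustration of the idea (not whisperX's code), the sketch below spells out integers and simple dollar amounts so that every token has a pronounceable form an aligner could match. The function names `int_to_words` and `spell_out` are hypothetical, and the rules cover only non-negative integers below one million and whole-dollar amounts.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def int_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999_999 (hypothetical helper)."""
    if n < 0 or n >= 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + ("" if rest == 0 else " " + int_to_words(rest))
    thousands, rest = divmod(n, 1000)
    return int_to_words(thousands) + " thousand" + ("" if rest == 0 else " " + int_to_words(rest))


def _safe_words(digits: str) -> str:
    """Return the spelled-out number, or the digits unchanged if out of range."""
    value = int(digits)
    return int_to_words(value) if value < 1_000_000 else digits


def spell_out(text: str) -> str:
    """Replace whole-dollar amounts and bare integers with words."""
    def dollars(match: re.Match) -> str:
        value = int(match.group(1))
        if value >= 1_000_000:
            return match.group(0)  # leave unsupported amounts untouched
        unit = "dollar" if value == 1 else "dollars"
        return f"{int_to_words(value)} {unit}"

    # Handle currency first so the "$" is consumed together with its number.
    text = re.sub(r"\$(\d+)", dollars, text)
    return re.sub(r"\d+", lambda m: _safe_words(m.group(0)), text)


print(spell_out("I paid $20 for 3 books in 1984"))
```

A full solution would also need decimals, ordinals, other currencies, and a mapping back to the original tokens so timestamps can be reattached to the unnormalized text.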
### Feature request

Whisper speech recognition without conditioning on previous text, as in https://github.com/openai/whisper/blob/7858aa9c08d98f75575035ecd6481f462d66ca27/whisper/transcribe.py#L278

### Motivation

The Whisper implementation is great; however, conditioning the decoding on previous text can cause significant...
# What does this PR do?

Add Whisper functionality to decode *without* conditioning on previous text.
https://github.com/openai/whisper/blob/7858aa9c08d98f75575035ecd6481f462d66ca27/whisper/transcribe.py#L278

When `condition_on_prev_text=False` (in the config), the following happens: Decoder attention is masked from...
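To make the difference concrete, here is a toy sketch loosely modeled on the prompt-reset logic in the linked `transcribe.py`. It is not the code from this PR: `build_prompts` is a hypothetical helper, and the integer lists stand in for decoded token ids. It only shows which earlier tokens each segment would be conditioned on.

```python
from typing import List


def build_prompts(segment_tokens: List[List[int]],
                  condition_on_previous_text: bool) -> List[List[int]]:
    """Return, for each segment, the previous-text tokens it is conditioned on."""
    all_tokens: List[int] = []
    prompt_reset_since = 0
    prompts: List[List[int]] = []

    for tokens in segment_tokens:
        # The prompt is whatever has been decoded since the last reset.
        prompts.append(all_tokens[prompt_reset_since:])
        all_tokens.extend(tokens)
        if not condition_on_previous_text:
            # Forget everything decoded so far before the next segment.
            prompt_reset_since = len(all_tokens)

    return prompts


segments = [[1, 2], [3], [4, 5]]
print(build_prompts(segments, condition_on_previous_text=True))
print(build_prompts(segments, condition_on_previous_text=False))
```

With conditioning enabled, each segment sees all earlier tokens, so an error in one segment can propagate into the next; with it disabled, every segment starts from an empty prompt.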
Hi there, Just wondering if you're planning on releasing the hyperparameters for other datasets soon? Particularly CUB. Cheers, Max