Patrick von Platen

1228 comments by Patrick von Platen

Please read: https://github.com/huggingface/distil-whisper/issues/26#issuecomment-1805643512

Would be cool to start a new distillation run for Whisper-large-v3 indeed! Let's see if we find some compute

We mainly trained on TPU v4s here. I believe @sanchit-gandhi will know best what hardware is needed :-)

The cross-attention head dimensions should be **exactly** the same as those of the corresponding teacher models (whisper-large-v2 for distil-whisper-32-2 and whisper-medium.en for distil-whisper-24-2).
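A minimal sketch of how that check could look, assuming the relevant quantities are the decoder head count and the per-head size (`d_model // decoder_attention_heads`). The field names mirror the `WhisperConfig` attributes `d_model` and `decoder_attention_heads`; the `AttentionShape` class, the `cross_attention_matches` helper, and the numbers below are illustrative assumptions, so read the real values from each checkpoint's config before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionShape:
    """Hypothetical container for the two config values that fix the head layout."""

    d_model: int
    decoder_attention_heads: int

    @property
    def head_dim(self) -> int:
        # Each head gets an equal slice of the model dimension.
        if self.d_model % self.decoder_attention_heads != 0:
            raise ValueError("d_model must be divisible by decoder_attention_heads")
        return self.d_model // self.decoder_attention_heads


def cross_attention_matches(student: AttentionShape, teacher: AttentionShape) -> bool:
    """Return True when the student has the same head count and per-head size as the teacher."""
    return (
        student.decoder_attention_heads == teacher.decoder_attention_heads
        and student.head_dim == teacher.head_dim
    )


# Illustrative values only (assumed, not read from a checkpoint).
teacher = AttentionShape(d_model=1280, decoder_attention_heads=20)
student = AttentionShape(d_model=1280, decoder_attention_heads=20)
print(cross_attention_matches(student, teacher))
```

With a real checkpoint, the same two fields could be filled from the loaded config object instead of being typed in by hand.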

> @sanchit-gandhi Sincerely thank you for your reply. What I want to know is, how to deal with beamsize >1 in speculative decoding? When draft model generated 4 beams, for example, and...

Also see this issue: https://github.com/huggingface/distil-whisper/issues/11

@souvikqb, please open a new issue as this question is not related to `beamsize`

Wow amazing work here @isamu-isozaki! cc'ing @patil-suraj here as well

It should be RGB format, see example here: https://huggingface.co/lllyasviel/sd-controlnet-canny#example
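For anyone preparing the conditioning image by hand, here is a rough sketch of getting an array into RGB layout, assuming the image is a NumPy array that is either a single-channel edge map or a 3-channel image in OpenCV-style BGR order. The `to_rgb` helper and its `channel_order` parameter are hypothetical names for illustration, not part of any library.

```python
import numpy as np


def to_rgb(image: np.ndarray, channel_order: str = "BGR") -> np.ndarray:
    """Return an (H, W, 3) array in RGB order.

    Hypothetical helper: accepts a 2-D grayscale/edge map or a 3-channel
    image whose channel order is given by `channel_order`.
    """
    if image.ndim == 2:
        # Single-channel edge map: repeat it across three channels.
        return np.stack([image, image, image], axis=-1)
    if image.ndim == 3 and image.shape[2] == 3:
        if channel_order == "BGR":
            # Reverse the channel axis; copy so the result is contiguous.
            return image[:, :, ::-1].copy()
        if channel_order == "RGB":
            return image
        raise ValueError(f"unsupported channel_order: {channel_order!r}")
    raise ValueError(f"unsupported image shape: {image.shape}")


edges = np.zeros((4, 4), dtype=np.uint8)  # stand-in for a Canny edge map
rgb_edges = to_rgb(edges)
print(rgb_edges.shape)
```

The resulting array can then be wrapped in whatever image type the pipeline expects.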

Happy to review a PR!