
Whisper support

gottlike opened this issue on Jun 21 '23 • 31 comments

Is support for Whisper on the roadmap? Something like https://github.com/ggerganov/whisper.cpp would be great.
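For context, this is the kind of workload being asked about. A minimal sketch using the reference openai-whisper package (assuming the package is installed and a local audio.mp3 exists; not vLLM code):

```python
# pip install openai-whisper  (OpenAI's reference implementation, not vLLM)
import whisper

model = whisper.load_model("base")      # downloads/loads the "base" checkpoint
result = model.transcribe("audio.mp3")  # runs the audio encoder + autoregressive text decoder
print(result["text"])                   # the transcribed text
```

The ask is essentially this transcription workload, but with vLLM-style batching and serving throughput.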

gottlike avatar Jun 21 '23 07:06 gottlike

Supporting encoder-decoder models is in our roadmap as mentioned in #187. Feel free to join the discussion and potentially contribute!

zhuohan123 avatar Jun 21 '23 14:06 zhuohan123

+1 for this feature

libratiger avatar Sep 14 '23 09:09 libratiger

+2 for this feature

silvacarl2 avatar Sep 23 '23 14:09 silvacarl2

+3 for this feature

xtqxk avatar Oct 24 '23 03:10 xtqxk

+4 for this feature

arun2728 avatar Dec 01 '23 04:12 arun2728

+555

SinanAkkoyun avatar Dec 15 '23 08:12 SinanAkkoyun

+1

Swiffers avatar Jan 02 '24 18:01 Swiffers

+1

hahazei avatar Feb 26 '24 09:02 hahazei

monitoring

binarycrayon avatar Feb 26 '24 21:02 binarycrayon

@zhuohan123 I am working on Whisper support.

afeldman-nm avatar Feb 28 '24 20:02 afeldman-nm

NO WAY!!!!!!!!!!!!!!!!!!! THAT WILL BE AWESOME!!!!!!!!!!!!!!!!!!!!!

silvacarl2 avatar Feb 28 '24 20:02 silvacarl2

I am working on this PR, and will soon submit the draft.

libratiger avatar Mar 04 '24 02:03 libratiger

THIS IS GOING TO BE HUGE, THX!

silvacarl2 avatar Mar 04 '24 16:03 silvacarl2

Hey @libratiger, together with @afeldman-nm I am now working full-time on the same target. Would you like to sync? It would be more efficient to share knowledge, rather than develop the same thing in two silos.

dbogunowicz avatar Mar 12 '24 15:03 dbogunowicz

You're right. I've just found a discussion about T5 (https://github.com/vllm-project/vllm/issues/187#issuecomment-1825244021), where there are differing opinions on encoder-decoder model support. Perhaps the situation will improve after that PR is merged?

libratiger avatar Mar 13 '24 02:03 libratiger

@libratiger the current status is as follows: Neural Magic has finalized the original T5 PR, and we are now benchmarking the solution. In parallel, we are also developing support for Whisper.

dbogunowicz avatar Mar 13 '24 12:03 dbogunowicz

@dbogunowicz any update on this issue? looking forward

JackZeng avatar Mar 28 '24 08:03 JackZeng

Hi! I am working on Whisper on our team's fork: https://github.com/neuralmagic/nm-vllm/pull/147 The current status: inference runs (both prompt prefill and autoregressive decoding), but I am seeing correctness issues, most likely caused by an erroneous attention-mask implementation.
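For anyone who wants to sanity-check outputs, here is a minimal sketch (assuming librosa is installed and a local sample.wav exists) that produces a reference transcript with the Hugging Face transformers implementation, to diff against what the fork generates:

```python
# Sketch: generate a reference transcript with Hugging Face transformers and
# compare it against the candidate implementation's output for the same audio.
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

audio, sr = librosa.load("sample.wav", sr=16000)  # Whisper expects 16 kHz audio

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
reference_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(reference_text)  # diff this against the fork's transcript
```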

dbogunowicz avatar Mar 28 '24 13:03 dbogunowicz

@dbogunowicz I ran the feature/demian/Whisper branch to run the Whisper model and hit an error: vllm/worker/model_runner.py, line 477, in prepare_decode: NameError: name 'multi_modal_input' is not defined, so execution cannot start.

junior-zsy avatar Apr 02 '24 10:04 junior-zsy

@junior-zsy fixed for now. Please keep in mind that we are still working on that PR, so it is very much in a WIP state. Let me explicitly set the appropriate flag on the PR.

dbogunowicz avatar Apr 02 '24 12:04 dbogunowicz

@dbogunowicz Ok, thank you. Hope it can be used soon

junior-zsy avatar Apr 03 '24 02:04 junior-zsy

same here, this is going to be really cool!

silvacarl2 avatar Apr 03 '24 13:04 silvacarl2

@dbogunowicz thanks for your work on Whisper! Since there is clearly interest in this feature and its completion timeline, I want to add the context that Whisper support depends on encoder/decoder support:

Issue: https://github.com/vllm-project/vllm/issues/187
PR: https://github.com/vllm-project/vllm/pull/3117

which is also WIP (it currently works partially but is not quite complete). I expect to complete encoder/decoder support soon. JFYI for anyone interested in timelines.
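To make the dependency concrete: Whisper is an encoder-decoder model, so serving it needs the same cross-attention machinery as T5/BART. A rough illustration using the Hugging Face checkpoint (a sketch, assuming transformers is installed; not vLLM code):

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# The audio encoder consumes log-mel spectrogram features.
print(type(model.get_encoder()).__name__)  # WhisperEncoder
# The text decoder generates tokens autoregressively and cross-attends into the
# encoder states; that cross-attention is what needs encoder/decoder support in vLLM.
print(type(model.get_decoder()).__name__)  # WhisperDecoder
```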

afeldman-nm avatar Apr 03 '24 14:04 afeldman-nm

+1

dwoodworth90 avatar Apr 26 '24 08:04 dwoodworth90

See the encoder/decoder support issue (https://github.com/vllm-project/vllm/issues/187) and new PR (https://github.com/vllm-project/vllm/pull/4289) for a status update on encoder/decoder support, which is a prereq for Whisper support.

afeldman-nm avatar Apr 30 '24 13:04 afeldman-nm

Hi, any update on serving faster-whisper via VLLM?

twicer-is-coder avatar May 21 '24 09:05 twicer-is-coder

Hi, any update on serving faster-whisper via VLLM?

Hi @twicer-is-coder ,

Whisper (or any variant thereof) is high on the list of models to add once infrastructure support is in; you can see the roadmap for infrastructure support in this PR:

https://github.com/vllm-project/vllm/pull/4942
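In the meantime, the standalone faster-whisper package (CTranslate2 backend, unrelated to vLLM) is the serving path the question refers to; a minimal sketch, assuming a CUDA GPU and a local audio.mp3:

```python
# Sketch of the standalone faster-whisper package; shown for context only, not vLLM.
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.mp3", beam_size=5)
print(f"Detected language: {info.language}")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```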

afeldman-nm avatar May 23 '24 17:05 afeldman-nm

FYI, encoder/decoder support landed in #4942 and there is an RFC (#7366) for follow-on encoder/decoder-related tasks, including adding Whisper support; the feedback period runs until August 16th. See https://github.com/vllm-project/vllm/issues/187#issuecomment-2278777339

afeldman-nm avatar Aug 09 '24 21:08 afeldman-nm

are you kidding me? is whisper supported now by vllm?

silvacarl2 avatar Aug 09 '24 21:08 silvacarl2

are you kidding me? is whisper supported now by vllm?

Adding Whisper support will hopefully follow shortly, now that the encoder/decoder infrastructure has landed. This is part of the RFC.

afeldman-nm avatar Aug 09 '24 21:08 afeldman-nm