afeldman-nm
FYI to reviewer - my PR is failing the buildkite/ci/pr/amd-distributed-tests test, with what appears to be a HuggingFace issue:

=========================== short test summary info ============================
FAILED distributed/test_chunked_prefill_distributed.py::test_models[16-5-half-meta-llama/Llama-2-7b-hf] - OSError: You...
Thanks @js8544 ! Taking a look
@js8544 please review this PR against your feature branch (https://github.com/js8544/vllm/pull/1): it adds a T5 encoder/decoder example file and also finishes merging upstream main into your PR.
FYI, I think this PR has some conflicts with recent changes to the main branch; I am looking at resolving them. This PR was previously passing all of the tests...
Hello @Abineshik, yes, things are moving apace. Thanks for checking in. I determined it is probably best for encoder/decoder models to have separate block tables for self- and...
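To illustrate why separate block tables make sense here, below is a minimal, hypothetical sketch (not vLLM's actual implementation; all names are made up for illustration): in an encoder/decoder model the cross-attention KV cache is sized once by the encoder output length, while the self-attention KV cache grows with each decoded token, so the two are naturally managed by independent tables.

```python
# Hypothetical sketch of per-sequence block tables for an
# encoder/decoder model. Assumption: fixed-size KV-cache blocks,
# as in paged-attention-style memory management.

BLOCK_SIZE = 16


def blocks_needed(num_tokens: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size cache blocks needed to hold num_tokens."""
    return -(-num_tokens // block_size)  # ceiling division


class EncDecBlockTables:
    """Toy container holding two independent block tables."""

    def __init__(self, encoder_len: int):
        # Cross-attention table: allocated once, sized by encoder length,
        # never grows during decoding.
        self.cross_blocks = list(range(blocks_needed(encoder_len)))
        # Self-attention table: starts empty and grows as tokens are decoded.
        self.self_blocks: list = []
        self.decoded_tokens = 0
        self._next_free_block = len(self.cross_blocks)

    def append_decoder_token(self) -> None:
        """Account for one newly decoded token, growing the self table."""
        self.decoded_tokens += 1
        while len(self.self_blocks) < blocks_needed(self.decoded_tokens):
            self.self_blocks.append(self._next_free_block)
            self._next_free_block += 1


tables = EncDecBlockTables(encoder_len=40)  # 40 encoder tokens -> 3 cross blocks
for _ in range(17):                         # 17 decoded tokens -> 2 self blocks
    tables.append_decoder_token()
print(len(tables.cross_blocks), len(tables.self_blocks))  # 3 2
```

The point of the split is that the cross-attention table can be frozen after the prefill of the encoder input, while only the self-attention table participates in per-step allocation during decoding.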
> @afeldman-nm how is the change you are working on going?

Work is still ongoing, but I hope to finish soon!
@zhuohan123 I am working on Whisper support.
@dbogunowicz thanks for your work on Whisper! Since there is clearly interest in this feature and its completion timeline, I want to add the context that Whisper support takes a...
See the encoder/decoder support issue (https://github.com/vllm-project/vllm/issues/187) and new PR (https://github.com/vllm-project/vllm/pull/4289) for a status update on encoder/decoder support, which is a prereq for Whisper support.
> Hi, any update on serving faster-whisper via VLLM?

Hi @twicer-is-coder, Whisper (or any variant thereof) is high on the list of models to add once infrastructure support is...