Roger Wang comments

Results: 131 comments of Roger Wang

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API

I can take a first pass too whenever it's ready if @robertgshaw2-neuralmagic doesn't get there before me :)

[Frontend] [Core] feat: Add model loading using `tensorizer`

Hey @sangstar! Thank you for the contribution - I will take a look at this PR next week and test it out!

[Bug]: vllm stall on llama3-70b warmup with 0.4.1

~~Hey @piercefreeman! AFAIK `tensorizer` shouldn't be supported when TP > 1.~~ ~~@sangstar Could you take a look at this issue?~~ Edit: Doesn't look like this is related to `tensorizer`.

[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, update `tensorizer` to version 2.9.0

Will take a look once I have some bandwidth - thanks for the continuous contribution to vLLM!

[Bug]: Engine iteration timed out. This should never happen!

FYI - this might have something to do with the custom all reduce operation. We have observed this same issue but it went away after specifying `--disable-custom-all-reduce` when launching the...
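
For reference, a minimal sketch of passing that flag at launch. The comment above is cut off, so what exactly was being launched is unknown; this assumes the OpenAI-compatible server entrypoint, and the model name and the `build_launch_command` helper are hypothetical placeholders, not something from the original thread.

```python
import shlex
from typing import List


def build_launch_command(
    model: str,
    tensor_parallel_size: int,
    disable_custom_all_reduce: bool = True,
) -> List[str]:
    """Build an argv list for launching a vLLM server (hypothetical helper)."""
    cmd = [
        "python", "-m", "vllm.entrypoints.openai.api_server",  # assumed entrypoint
        "--model", model,
        "--tensor-parallel-size", str(tensor_parallel_size),
    ]
    if disable_custom_all_reduce:
        # The flag mentioned in the comment: skip the custom all-reduce path.
        cmd.append("--disable-custom-all-reduce")
    return cmd


# "my-org/my-model" is a placeholder, not a real model name.
command = build_launch_command("my-org/my-model", tensor_parallel_size=4)
print(shlex.join(command))
```

The list can be handed to `subprocess.run(command)` on a machine where vLLM is installed; building it separately just makes it easy to toggle the flag while checking whether the timeout goes away.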

[Question] Usage with Multimodal LLM

Closing this since Llava1.5 (or a general vision language framework) has already been added in https://github.com/vllm-project/vllm/pull/3042. We will continue working on supporting other models on a best-effort basis, but any...

[Core] Consolidate prompt arguments to LLM engines

Hey @DarkLight1337! Sorry I've been a bit busy lately, but I will surely take a look in the upcoming week! Apologies for the delay!

[Core] Support image processor

Per offline discussion - waiting for #5118 to be merged first.

[Core] Support image processor

@DarkLight1337 Could you resolve the merge conflicts? Once that's done I think this PR is ready to merge.

[Core] Consolidate prompt arguments to LLM engines

@DarkLight1337 I just went through this PR again and made a change to move the offline API reference under the developer docs. #4710 was a great addition, but I think we...