Roger Wang
I can take a first pass too whenever it's ready if @robertgshaw2-neuralmagic doesn't get there before me :)
Hey @sangstar ! Thank you for the contribution - I will take a look at this PR next week and test it out!
~~Hey @piercefreeman! AFAIK `tensorizer` shouldn't be supported when TP > 1.~~ ~~@sangstar Could you take a look at this issue?~~ Edit: Doesn't look like this is related to `tensorizer`.
Will take a look once I have some bandwidth - thanks for the continuous contribution to vLLM!
FYI - this might have something to do with the custom all reduce operation. We have observed this same issue but it went away after specifying `--disable-custom-all-reduce` when launching the...
Closing this since Llava1.5 (or a general vision language framework) has already been added in https://github.com/vllm-project/vllm/pull/3042. We will continue working on supporting other models on a best-effort basis, but any...
Hey @DarkLight1337! Sorry I've been a bit busy lately, but I will surely take a look in the upcoming week! Apologies for the delay!
Per offline discussion - waiting for #5118 to be merged first.
@DarkLight1337 Could you resolve the merge conflicts? Once that's done I think this PR is ready to merge.
@DarkLight1337 I just went through this PR again and made a change to move the offline API reference under the developer docs. #4710 was a great addition, but I think we...