Roger Wang

132 comments by Roger Wang

Closing this as we merged https://github.com/vllm-project/vllm/pull/5237

> ### UPDATE on 2024-05-23
> Workaround: Use the `--disable-custom-all-reduce` flag when starting the vLLM instance. Thanks @ywang96 !

@itechbear Glad that this does resolve your issue - I suspect...
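The workaround above is applied at server launch time; a minimal sketch of starting the OpenAI-compatible vLLM server with the flag (the model name and tensor-parallel size here are illustrative, not from the original comment):

```shell
# Launch vLLM with the custom all-reduce kernel disabled as a workaround.
# Model name and --tensor-parallel-size are example values only.
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --tensor-parallel-size 2 \
    --disable-custom-all-reduce
```

The flag only matters for multi-GPU (tensor-parallel) deployments, where the custom all-reduce path is otherwise used.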

> In general it is very difficult to clean up all resources correctly, especially when we use multiple GPUs, and might be prone to [deadlocks](https://github.com/vllm-project/vllm/pull/4508#issuecomment-2087794774).

I...

We should probably get this merged - I will review it, thanks for the ping @jeejeelee

> Is MiniCPM-Llama3-V 2.5 supported? I tried it and it doesn't seem to work. I used https://github.com/OpenBMB/vllm.

Not yet, @DarkLight1337 and I have been working on the refactoring and we're...

I'm working on a PR for this currently. See #5189

cc @DarkLight1337 @Isotr0py @alsichcan

cc @robertgshaw2-neuralmagic @mgoin (since NM planned to work on Whisper)

Thank you all for the feedback so far! I plan to address the feedback altogether after meeting up with the core...

I discussed this with @zhuohan123 offline - in particular regarding this comment:

> To avoid having to modify the core Engine logic each time, we can wrap the data...