Roger Wang

132 comments by Roger Wang

Closing this as we merged https://github.com/vllm-project/vllm/pull/5237

> ### UPDATE on 2024-05-23
> Workaround: Use the `--disable-custom-all-reduce` flag when starting the vLLM instance. Thanks @ywang96 !

@itechbear Glad that this does resolve your issue - I suspect...
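The workaround above is applied at server launch time; a minimal sketch of starting the OpenAI-compatible vLLM server with the flag (the model name and tensor-parallel size here are illustrative, not from the original comment):

```shell
# Launch vLLM with the custom all-reduce kernel disabled as a workaround.
# Model name and --tensor-parallel-size are example values only.
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --tensor-parallel-size 2 \
    --disable-custom-all-reduce
```

The flag only matters for multi-GPU (tensor-parallel) deployments, where the custom all-reduce path is otherwise used.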

> In general it is very difficult to clean up all resources correctly, especially when we use multiple GPUs, and might be prone to [deadlocks](https://github.com/vllm-project/vllm/pull/4508#issuecomment-2087794774).

I...

We should probably get this merged - I will review it, thanks for the ping @jeejeelee

> Is MiniCPM-Llama3-V 2.5 supported? I tried it and it doesn't seem to work. I used https://github.com/OpenBMB/vllm.

Not yet, @DarkLight1337 and I have been working on the refactoring and we're...

I'm working on a PR for this currently. See #5189

cc @DarkLight1337 @Isotr0py @alsichcan

cc @robertgshaw2-neuralmagic @mgoin (since NM planned to work on Whisper)

Thank you all for the feedback so far! I plan to address the feedback altogether after meeting up with the core...

I discussed this with @zhuohan123 offline - in particular regarding this comment:

> To avoid having to modify the core Engine logic each time, we can wrap the data...