Cody Yu

161 comments by Cody Yu

The updated code doesn't seem equivalent to the original? Currently, for non-HIP and non-XPU cases, we don't initialize Ray with all GPUs.

Sounds reasonable to me, but cc @rkooo567 @richardliaw to double-check.

Overall LGTM. Please also add a unit test in `tests/entrypoints/test_chat_utils.py`. cc @ywang96 @DarkLight1337

> LGTM but this may be reworked anyhow in https://github.com/vllm-project/vllm/pull/14048.

That looks like a more sophisticated solution. I could close this one if #14048 is going to land soon.

> @comaniac Could you please take a look? The PR removes a few lines of code in the model loader that you marked as `FIXME`.

That FIXME should be safe to remove....

The message looks good to me. In addition, please also update the other parts, such as the logging and the dumped file name. Please search for `request_rate` in this file and make sure `max_concurrency`...

> > The message looks good to me. In addition, please also update other parts such as logging and dumped file name. Please search `request_rate` in this file and make...

I'm a bit confused. The log you showed is more about `Engine in background thread`, which is not directly related to Ray 2.44. Generally speaking, if you see a warning like...

> Even after setting `VLLM_USE_V1=1`, deploying vllm 0.8.2 with Ray 2.44 falls back to using the V0 engine. I have not tested it with Ray 2.43.
>
> Do...
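For anyone checking engine selection locally, here is a minimal sketch of reading the environment variable before launch. It only illustrates the `VLLM_USE_V1` flag itself; the `v1_requested` helper is hypothetical, and vLLM's actual fallback logic (which can override the flag for unsupported configurations) is not reproduced here.

```python
import os

def v1_requested(environ=os.environ) -> bool:
    # vLLM reads VLLM_USE_V1 from the environment; "1" requests the V1 engine.
    # Note: vLLM may still fall back to V0 for unsupported configurations,
    # so this only tells you what was requested, not what will run.
    return environ.get("VLLM_USE_V1") == "1"

# Request the V1 engine for the current process before creating the engine.
os.environ["VLLM_USE_V1"] = "1"
print(v1_requested())  # True
```

If the engine still falls back to V0 after this, the startup logs usually state the reason for the fallback, which is the more useful signal to share when reporting the issue.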