
[serve] vllm example to serve llm models

Open can-anyscale opened this issue 9 months ago • 3 comments

Adds a documentation example using vLLM to serve LLM models on Ray Serve.

This is a copy of https://github.com/ray-project/ray/pull/45325, plus it adds a build environment for Ray Serve + vLLM.
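For context, a minimal sketch of the kind of example the PR adds: a Ray Serve deployment wrapping a vLLM engine. All names here (the builder function, model choice, deployment class) are assumptions for illustration, not taken from the PR itself; the heavy imports are deferred into the builder so the structure is readable without `ray` or `vllm` installed.

```python
# Hypothetical sketch (not the PR's actual code) of serving a vLLM model
# behind a Ray Serve deployment.

def build_app(model: str = "facebook/opt-125m"):
    # Deferred imports: requires `pip install "ray[serve]" vllm` to actually run.
    from ray import serve
    from vllm import LLM

    @serve.deployment
    class VLLMDeployment:
        def __init__(self):
            # "half" (float16) chosen since some CI GPUs lack bfloat16 support.
            self.llm = LLM(model=model, dtype="half")

        async def __call__(self, request):
            # Ray Serve passes a Starlette request; expect {"prompt": "..."}.
            prompt = (await request.json())["prompt"]
            output = self.llm.generate([prompt])[0]
            return {"text": output.outputs[0].text}

    # .bind() produces the application graph Serve can deploy.
    return VLLMDeployment.bind()
```

Deploying would then look like `serve.run(build_app())`, assuming a GPU node with both packages installed.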

Test:

  • CI

can-anyscale avatar May 18 '24 04:05 can-anyscale

@akshay-anyscale, @edoakes I managed to create an environment for the test to run, but it fails for other reasons: https://buildkite.com/ray-project/microcheck/builds/237#018f8c35-e5a1-443d-8cf9-bbb481af6c1e/177-2429; if this makes sense, feel free to change this PR, thanks!

can-anyscale avatar May 18 '24 20:05 can-anyscale

@aslonnie this is intended to get merged, but serve folks will need to pick it up and finish the job ;)

can-anyscale avatar May 20 '24 13:05 can-anyscale

> @akshay-anyscale, @edoakes I managed to create an environment for the test to run, but it fails for other reasons: https://buildkite.com/ray-project/microcheck/builds/237#018f8c35-e5a1-443d-8cf9-bbb481af6c1e/177-2429; if this makes sense, feel free to change this PR, thanks!

Pushed a commit to change the dtype, hopefully that fixes things.
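The commit in question isn't shown, but a dtype fix of this kind is usually a one-line engine-argument change: vLLM's `dtype` defaults to the model's native precision (often bfloat16), which some CI GPUs such as the V100 don't support, so forcing float16 is a common workaround. A hypothetical config fragment (model name and variable name assumed):

```python
# Hypothetical engine-args fragment, not the actual commit:
engine_args = {
    "model": "facebook/opt-125m",  # small model commonly used in CI examples
    "dtype": "half",               # float16; avoids bfloat16, which e.g. V100 GPUs lack
}
```

These kwargs would be passed through to the vLLM engine, e.g. `LLM(**engine_args)`.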

edoakes avatar May 20 '24 13:05 edoakes

Is ray-llm going to be deprecated, and will this example become the recommended way to run vLLM on Ray?

carsonwang avatar May 21 '24 08:05 carsonwang