
Add example of serving LLM with Ray Serve and vLLM

Open — akshay-anyscale opened this pull request 1 year ago • 4 comments

Why are these changes needed?

Adds a documentation example that uses vLLM to serve LLMs on Ray Serve.
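The typical pattern such an example follows is a Ray Serve deployment class that hosts a vLLM `AsyncLLMEngine` per replica and streams results out of `engine.generate`. The sketch below illustrates that pattern under stated assumptions; the names `build_app`, `VLLMDeployment`, and `messages_to_prompt`, the default model, and the request/response shape are all hypothetical, not taken from this PR's actual example. It assumes `ray[serve]` and a pre-v1 `vllm` are installed; the heavy imports are kept local to `build_app` so the prompt helper can be used without them.

```python
import uuid


def messages_to_prompt(messages):
    """Flatten OpenAI-style chat messages into a single prompt string.

    Hypothetical helper for illustration; real examples would use the
    model's chat template instead of this naive formatting.
    """
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)


def build_app(model: str = "facebook/opt-125m"):
    """Build a Ray Serve app wrapping a vLLM async engine (sketch)."""
    # Local imports so this module can be read/tested without GPU deps.
    from ray import serve
    from starlette.requests import Request
    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.engine.async_llm_engine import AsyncLLMEngine
    from vllm.sampling_params import SamplingParams

    @serve.deployment(ray_actor_options={"num_gpus": 1})
    class VLLMDeployment:
        def __init__(self):
            # One async engine per replica; it batches concurrent requests
            # internally via continuous batching.
            self.engine = AsyncLLMEngine.from_engine_args(
                AsyncEngineArgs(model=model)
            )

        async def __call__(self, request: Request) -> dict:
            body = await request.json()
            prompt = messages_to_prompt(body["messages"])
            params = SamplingParams(max_tokens=body.get("max_tokens", 128))
            # engine.generate yields intermediate outputs; keep the last
            # (finished) one for a non-streaming response.
            final = None
            async for output in self.engine.generate(
                prompt, params, request_id=str(uuid.uuid4())
            ):
                final = output
            return {"text": final.outputs[0].text}

    return VLLMDeployment.bind()
```

A deployment like this would be started with `serve.run(build_app())` and queried with a POST carrying `{"messages": [...]}`; the real documentation example in this PR should be consulted for the supported request schema.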

Related issue number

Checks

  • [ ] I've signed off every commit (using the -s flag, i.e., git commit -s) in this PR.
  • [ ] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

akshay-anyscale avatar May 14 '24 07:05 akshay-anyscale

Ready to merge, pending @aslonnie's approval.

akshay-anyscale avatar May 17 '24 00:05 akshay-anyscale

Lint is failing; I just added lint to microcheck yesterday ;), sorry for not catching this sooner.

can-anyscale avatar May 17 '24 13:05 can-anyscale

Pushed a commit to fix the linter.

edoakes avatar May 17 '24 13:05 edoakes

Am I right in assuming that the Prometheus metrics vLLM exports will automatically be propagated to the Ray Serve metrics endpoint (provided I enable log_stats when starting my vLLM engine)?

kousun12 avatar May 18 '24 15:05 kousun12