feat: Inference using vLLM with RayServe on Inf2

Open ratnopam opened this issue 1 year ago • 0 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

- Create a new blueprint that showcases how to use vLLM with RayServe on AWS Inf2 instances
- Deploy an LLM as the inference example

Describe the solution you would like

Create a pattern that serves inference using RayServe with a vLLM backend on Inf2. Add a website doc with step-by-step instructions for deploying and testing the pattern.
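
To make the "testing" step concrete, here is a minimal sketch of a smoke-test client the doc could include. Everything specific in it is an assumption rather than part of this request: the URL, the `/infer` path, and the `prompt` / `max_tokens` payload fields are hypothetical placeholders, and the real blueprint may expose a different route and schema. It uses only the Python standard library.

```python
import json
import urllib.request

# Hypothetical placeholder: the real service address and route depend on
# how the blueprint exposes the RayServe deployment.
SERVICE_URL = "http://localhost:8000/infer"


def build_request(prompt: str, max_tokens: int = 64,
                  url: str = SERVICE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the inference endpoint.

    The payload field names are assumptions made for illustration only.
    """
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Once the pattern is deployed and the service is reachable (for example through a port-forward), calling `send(build_request("Hello"))` would return whatever the deployment responds with. Building the request separately from sending it keeps the payload easy to inspect without a live cluster.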

Describe alternatives you have considered

Additional context

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html

ratnopam, Jul 17 '24 00:07