data-on-eks
feat: Add blueprint for using rayserve with vLLM on Inferentia2
What does this PR do?
Adds the capability to deploy LLMs for inference on AWS Inferentia2 with Ray Serve and vLLM.
🛑 Please open an issue first to discuss any significant work and flesh out details/direction; we would hate for your time to be wasted. Consult the CONTRIBUTING guide for submitting pull-requests.
Motivation
#591
More
- [ ] Yes, I have tested the PR using my local account setup (Provide any test evidence report under Additional Notes)
- [ ] Mandatory for new blueprints. Yes, I have added an example to support my blueprint PR
- [ ] Mandatory for new blueprints. Yes, I have updated the `website/docs` or `website/blog` section for this feature
- [ ] Yes, I ran `pre-commit run -a` with this PR. Link for installing pre-commit locally
For Moderators
- [ ] E2E Test successfully complete before merge?