Documented example for autoscaling a multinode deployment
🚀 Feature Description and Motivation
The documentation explains how to use autoscaling and how to run multi-node deployments, but not how to do both at the same time. I ask because multi-node deployments require scaling both the head and worker nodes, and I am unsure whether this affects how the autoscaler needs to be set up.
The closest to a documented example is in an issue.
Does anyone have a working example of a metrics-based autoscaling multi-node deployment?
Use Case
Having a scalable multi-node deployment would help optimise cost and handle spikes in demand.
Proposed Solution
No response
Isn't this issue enough of an example? 🤔 https://github.com/vllm-project/aibrix/issues/986