aibrix
Implement model architect aware scheduling policies
🚀 Feature Description and Motivation
Currently, the runtime is responsible for downloading the model weights. When another replica needs to be deployed, one option is to schedule it onto a node that already has the weights. It would be great to contribute a scheduler plugin that is aware of these artifacts.
Use Case
No response
Proposed Solution
No response
This should be done in the cold start manager or some other reusable component.
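To make the idea concrete, here is a minimal sketch of the scoring logic such a plugin might apply: nodes that already hold the model weights get the top score, all others get zero. This is only an illustration, not a proposed implementation. The names `Node`, `cached_models`, `score_node`, and `pick_node` are hypothetical, and how a node would report its cached artifacts is left open. The 0-100 range is assumed to mirror the score range used by Kubernetes scheduler Score plugins.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumption: mirrors the 0-100 score range of Kubernetes scheduler Score plugins.
MAX_SCORE = 100


@dataclass
class Node:
    """Hypothetical view of a node and the model artifacts already on its disk."""
    name: str
    cached_models: set[str] = field(default_factory=set)


def score_node(node: Node, model_id: str) -> int:
    """Give the top score to a node that already has the weights, else 0."""
    return MAX_SCORE if model_id in node.cached_models else 0


def pick_node(nodes: list[Node], model_id: str) -> Node:
    """Return the highest-scoring node; ties go to the first node in the list."""
    if not nodes:
        raise ValueError("no candidate nodes")
    return max(nodes, key=lambda n: score_node(n, model_id))


if __name__ == "__main__":
    candidates = [Node("node-a"), Node("node-b", {"example-model"})]
    print(pick_node(candidates, "example-model").name)
```

In a real scheduler this score would be one signal among several (GPU availability, load, and so on), so a weighted combination is likely a better fit than picking on cache locality alone.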