[RFC] Add Model API as the umbrella for model-centric deployment
🚀 Feature Description and Motivation
As we discussed offline, we are considering providing a Model API object, designed to simplify and unify model deployment by managing all related configuration under one umbrella. Today, users have to manually set up components such as HTTPRoute, HPA, initContainer, and other runtime specifics. This requires careful coordination between the user and the infrastructure, which adds complexity to deployment workflows.
With the Model API object, we aim to:
- Simplify deployment: Automatically manage all necessary components (such as routing, scaling, and runtime setup) through a single object, reducing user-side configuration overhead.
- Enable features seamlessly: The umbrella object makes it easier to bring features such as heterogeneous serving, distributed serving, traffic routing, and runtime initialization to model-centric deployments with minimal user effort.
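To make the idea concrete, here is a rough sketch of how one model object could fan out into the components users currently write by hand. Everything specific in it is an assumption for illustration only: `ModelSpec`, its fields, `expand`, and the `fetch-model` init container with its placeholder image are hypothetical and are not the proposed schema. Only the Kubernetes kinds and API versions are real. Volumes and HPA metrics are left out for brevity.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ModelSpec:
    """Hypothetical user-facing fields; NOT the actual Model API schema."""
    name: str
    image: str
    model_uri: str         # where the init container would fetch weights from
    min_replicas: int = 1
    max_replicas: int = 3
    port: int = 8000


def expand(spec: ModelSpec) -> list[dict[str, Any]]:
    """Derive the child manifests a controller might own for one model."""
    labels = {"model": spec.name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec.name, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Runtime initialization: pull model weights before serving.
                    "initContainers": [{
                        "name": "fetch-model",
                        "image": "example.invalid/model-fetcher:latest",  # placeholder
                        "args": [spec.model_uri, "/models"],
                    }],
                    "containers": [{
                        "name": "server",
                        "image": spec.image,
                        "ports": [{"containerPort": spec.port}],
                    }],
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": spec.name},
        "spec": {"selector": labels, "ports": [{"port": spec.port}]},
    }
    hpa = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": spec.name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1", "kind": "Deployment", "name": spec.name,
            },
            "minReplicas": spec.min_replicas,
            "maxReplicas": spec.max_replicas,
        },
    }
    route = {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": spec.name},
        # Route traffic to the Service created above.
        "spec": {"rules": [{"backendRefs": [{"name": spec.name, "port": spec.port}]}]},
    }
    return [deployment, service, hpa, route]
```

In this sketch the user only supplies a name, a serving image, and a model location; the Deployment, Service, HPA, and HTTPRoute are derived from that single object, which is the coordination burden the Model API is meant to remove.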
Use Case
No response
Proposed Solution
No response
I moved this to v0.1.0 since further discussion and changes may be needed.
Moved to v0.3.0 due to limited resources.
It is helpful. Will we have a design doc for this feature?