[Umbrella] Add webhook for validation
🚀 Feature Description and Motivation
A webhook validates CRD objects at admission time, so a misconfiguration fails fast instead of surfacing later during runtime validation.
Use Case
If a CRD is misconfigured, fail fast.
- [x] webhook framework
- [ ] add integration tests to CI and separate them from E2E tests
- [ ] ModelAdapter
- [ ] PodAutoscaler
- [ ] KVCache
- [ ] RayClusterFleet
- [ ] RayClusterReplicaSet
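To make the fail-fast idea concrete, here is a minimal Go sketch of the kind of check a validating webhook could run at admission time. `PodAutoscalerSpec` and its fields are simplified, hypothetical stand-ins rather than the real AIBrix types, and the webhook server wiring is omitted on purpose.

```go
package main

import (
	"fmt"
	"strings"
)

// PodAutoscalerSpec is a hypothetical, simplified stand-in for a CRD spec.
// The real AIBrix type has different fields.
type PodAutoscalerSpec struct {
	ScaleTargetName string
	MinReplicas     int32
	MaxReplicas     int32
}

// ValidateSpec collects every problem it finds and reports them together,
// so a user can fix a misconfigured object in one round trip.
func ValidateSpec(spec PodAutoscalerSpec) error {
	var problems []string
	if spec.ScaleTargetName == "" {
		problems = append(problems, "scaleTargetName must not be empty")
	}
	if spec.MinReplicas < 0 {
		problems = append(problems, fmt.Sprintf("minReplicas must be >= 0, got %d", spec.MinReplicas))
	}
	if spec.MaxReplicas < spec.MinReplicas {
		problems = append(problems, fmt.Sprintf("maxReplicas (%d) must be >= minReplicas (%d)", spec.MaxReplicas, spec.MinReplicas))
	}
	if len(problems) == 0 {
		return nil
	}
	return fmt.Errorf("invalid spec: %s", strings.Join(problems, "; "))
}

func main() {
	bad := PodAutoscalerSpec{MinReplicas: 5, MaxReplicas: 2}
	if err := ValidateSpec(bad); err != nil {
		// In a real webhook this error would become the admission denial message.
		fmt.Println("rejected at admission:", err)
	}
}
```

Keeping the check as a plain function, separate from any webhook plumbing, would also let the same logic be unit tested or reused elsewhere.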
Proposed Solution
No response
/assign
We can discuss the webhook usage in more detail. In the examples, we use Hugging Face models for simplicity. In the real world, however, most users have to fetch weights from S3-like object storage.
The challenge at the moment is that AIBrix doesn't have any orchestration support to hide those details, the way llmaz or KubeAI do. As a mid-term solution, I am wondering whether we can use the webhook to convert model configuration from annotations into spec fields, for example by injecting a sidecar container for model downloading. That would fill the gap left by the missing model orchestration.
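A rough sketch of the annotation-to-spec idea, in plain Go with no Kubernetes dependencies. Everything named here is an assumption for illustration: the annotation key `example.aibrix.ai/model-source`, the `model-downloader` container name and image, and the trimmed-down `PodSpec` and `Container` types. A real mutating webhook would operate on the actual pod template and return a patch.

```go
package main

import "fmt"

// Hypothetical names used only for this sketch.
const (
	sourceAnnotation = "example.aibrix.ai/model-source"
	downloaderName   = "model-downloader"
	downloaderImage  = "example.com/model-downloader:latest"
)

// Container and PodSpec are simplified stand-ins for the Kubernetes types.
type Container struct {
	Name  string
	Image string
	Args  []string
}

type PodSpec struct {
	Containers []Container
}

// InjectDownloader returns a copy of spec with a downloader sidecar added
// when the source annotation is set. It is idempotent: if the sidecar is
// already present, the spec is returned unchanged.
func InjectDownloader(annotations map[string]string, spec PodSpec) PodSpec {
	uri := annotations[sourceAnnotation] // reading a nil map is safe in Go
	if uri == "" {
		return spec
	}
	for _, c := range spec.Containers {
		if c.Name == downloaderName {
			return spec
		}
	}
	// Copy the slice so the caller's spec is not modified.
	out := PodSpec{Containers: append([]Container{}, spec.Containers...)}
	out.Containers = append(out.Containers, Container{
		Name:  downloaderName,
		Image: downloaderImage,
		Args:  []string{"--source", uri},
	})
	return out
}

func main() {
	spec := PodSpec{Containers: []Container{{Name: "vllm", Image: "example.com/vllm:latest"}}}
	annotations := map[string]string{sourceAnnotation: "s3://my-bucket/my-model"}
	mutated := InjectDownloader(annotations, spec)
	for _, c := range mutated.Containers {
		fmt.Println(c.Name, c.Image, c.Args)
	}
}
```

Idempotency matters here because admission webhooks may be invoked more than once for the same object, so the mutation should not add a second sidecar on a repeat call.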
@kerthcet v0.3.0 will be rolled out no later than mid-May. We can leave some tasks for the v0.4.0 release. If there are tasks you feel need to be finished before the v0.3.0 release, please comment here.
There is one requirement I'd like to discuss with you here. Due to integration complexity, some users prefer the standalone deployment: https://aibrix.readthedocs.io/latest/getting_started/installation/installation.html#install-individual-aibrix-components. In this case, they just want an individual controller. Once we introduce webhook validation, we probably won't deploy a webhook along with each controller, but we would still like to do some basic validation in those cases. What are your thoughts on this?
Webhook + controller is still a standalone solution and requires no additional effort, so what is their concern here? Alternatively, we could use CEL, which is built into the apiserver, but I don't know whether it meets all of our requirements.
In that case, let's assume a user deploys 2 controllers. Do they need 2 controllers (separate deployments) + 1 webhook server, or 2 * (1 controller + 1 webhook server)?
The webhook is deployed together with the controller, that is to say, we just need 2 * (controller + webhook).
@kerthcet Sounds good. Do we plan to add further improvements to v0.3.0? The proposed cutoff is next Friday.
v0.3.0 already includes the webhook framework. For workload type validation and the rest, let's move to v0.4.0.
For standalone installation, what's our plan now?
@kerthcet Standalone installation currently uses the --disable-webhook option by default: https://github.com/vllm-project/aibrix/blob/402c62c2bb32da951ecaa13a25176f9fbe72c5d7/config/standalone/kv-cache-controller/patch.yaml#L17
We can switch it to enabled, but we would need to change the manifests and handle some potential naming conflicts. This is a TODO item.
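For discussion, here is a small Go sketch of how a `--disable-webhook` flag could gate where validation runs, so a standalone controller still gets basic checks. This is only one possible approach, not how AIBrix is implemented: `Options`, `ParseOptions`, and `ValidationPath` are hypothetical names, and only the flag name comes from the standalone manifest linked above.

```go
package main

import (
	"flag"
	"fmt"
)

// Options holds the parsed controller flags (hypothetical struct).
type Options struct {
	DisableWebhook bool
}

// ParseOptions parses controller arguments. Go's flag package treats
// -disable-webhook and --disable-webhook as equivalent.
func ParseOptions(args []string) (Options, error) {
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	var opts Options
	fs.BoolVar(&opts.DisableWebhook, "disable-webhook", false,
		"do not register the webhook server")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	return opts, nil
}

// ValidationPath reports where validation would run under this sketch:
// in the webhook when it is enabled, otherwise inside the controller's
// reconcile loop as a fallback (an assumption, not current behavior).
func ValidationPath(opts Options) string {
	if opts.DisableWebhook {
		return "controller"
	}
	return "webhook"
}

func main() {
	opts, err := ParseOptions([]string{"--disable-webhook"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println("validation runs in:", ValidationPath(opts))
}
```

If the validation logic lives in a shared function, both paths could call it, and switching the standalone manifests to enable the webhook later would not change the rules being enforced.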