Jiaxin Shan
@kerthcet v0.3.0 will be rolled out no later than mid May. We can leave some tasks to the v0.4.0 release. If there are some tasks you feel are necessary to finish before v0.3.0...
In that case, let's assume a user deploys 2 controllers. Do they need 2 controllers (separate deployments) + 1 webhook server, or 2 * (1 controller + 1 webhook server)?
@kerthcet Sounds good. Do we plan to add further improvements in v0.3.0? The proposed cut-off is next Friday.
In v0.3.0 we already have the webhook framework supported (for workload type validation etc.); let's move this to v0.4.0.
@kerthcet The standalone installation uses the `--disable-webhook` option by default at this moment, https://github.com/vllm-project/aibrix/blob/402c62c2bb32da951ecaa13a25176f9fbe72c5d7/config/standalone/kv-cache-controller/patch.yaml#L17 We can switch it to enabled, but we need to change the manifests and handle some potential naming conflicts. This...
There are two options.

1. Add the support on the engine side: we pass everything into the inference engine.
2. The runtime should pick it up and download it: we need to change...
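The second option, the runtime downloading the artifact itself, could be sketched roughly as below. This is only an illustration, not the actual AIBrix runtime code: the function name `pickDownloader` and the backend names are assumptions; the real runtime would presumably dispatch on the URL scheme in a similar way.

```go
package main

import (
	"fmt"
	"net/url"
)

// pickDownloader is a hypothetical sketch of option 2: the runtime inspects
// the artifact URL's scheme and selects a download backend for it.
// The backend names here are illustrative only.
func pickDownloader(artifactURL string) (string, error) {
	u, err := url.Parse(artifactURL)
	if err != nil {
		return "", err
	}
	switch u.Scheme {
	case "s3":
		return "s3-downloader", nil
	case "gcs":
		return "gcs-downloader", nil
	case "huggingface":
		return "huggingface-downloader", nil
	default:
		return "", fmt.Errorf("unsupported artifact scheme %q", u.Scheme)
	}
}

func main() {
	d, err := pickDownloader("s3://models/llama-2-7b")
	if err != nil {
		panic(err)
	}
	fmt.Println(d) // prints "s3-downloader"
}
```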
```
// ArtifactURL is the address of the model artifact to be downloaded. Different protocols are supported, like s3, gcs, huggingface
// +kubebuilder:validation:Required
ArtifactURL string `json:"artifactURL,omitempty"`
// CredentialsSecretRef points to the secret...
```
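For context, a custom resource using these fields might look like the sketch below. The `apiVersion`, `kind`, and secret layout are assumptions made for illustration; only the field names `artifactURL` and `credentialsSecretRef` come from the snippet above.

```yaml
# Hypothetical usage sketch; group/version and kind are assumed.
apiVersion: model.aibrix.ai/v1alpha1
kind: ModelAdapter
metadata:
  name: example-adapter
spec:
  artifactURL: s3://my-bucket/models/example-lora
  credentialsSecretRef:
    name: s3-credentials   # assumed secret holding the download credentials
```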
Since it involves a design question, we cannot finish this story by RC1. It can be moved to RC2 instead.
## Design consideration

1. The engine starts with some credentials, so we can load a LoRA; if the LoRA requires a credential to be downloaded, it has to be the credential we gave to...
Change to v0.2.0 instead.