Jiaxin Shan
### 🚀 Feature Description and Motivation - Option 1: zap (native solution), good for structured logging - Option 2: klogr + klog - the simplest solution with minimal changes no...
### 🚀 Feature Description and Motivation Kubebuilder internally uses kustomize to manage installations. However, many users prefer using Helm for managing Kubernetes manifests. Currently, Kubebuilder does not support direct generation...
We have met a few cases where a single deployment needs to be spread across different chips due to quota limits or resource shortages. However, in Kubernetes, most of the time we use...
### 🚀 Feature Description and Motivation We already define a few autoscaling evaluation metrics, such as provision efficiency, SLO violations, and resource usage. It would be great for the controller to evaluate...
### 🚀 Feature Description and Motivation This is a follow-up to https://github.com/aibrix/aibrix/issues/419. We want a more elegant implementation to disable logs from `/health` and `/metrics` in vLLM. I...
### 🐛 Describe the bug In this case, if you do not commit the change, the Docker tag will always be the same. Sometimes, it's not that easy to debug when...
### 🚀 Feature Description and Motivation For the 33B model deployment, we have a few options: A10, V100-32GiB, L20, and L40. Technically, we can launch the instance using M * N...
### 🚀 Feature Description and Motivation Currently, the runtime handles downloading the model weights. If another replica needs to be deployed, one option is to...
### 🚀 Feature Description and Motivation Cache locality can be leveraged to reduce model startup time. As users use ranks up to 128, which is fairly large, this feature...
### 🚀 Feature Description and Motivation ## Background Different requests have varying input/output lengths, leading to diverse resource requirements. Currently, when a batch of requests gets scheduled together, it is...