Jiaxin Shan issues

Results: 271 issues of Jiaxin Shan

[RFC]: Make API Gateway interface OpenAI compatible

### Summary This RFC proposes making the API Gateway interface within AIBrix compatible with OpenAI. We hit a few issues in the past few days, and @gaocegege also suggested https://github.com/vllm-project/aibrix/issues/732 earlier...

kind/enhancement

area/gateway

priority/critical-urgent

[Observation] Improve AIBrix control plane monitoring

### 🚀 Feature Description and Motivation AIBrix is composed of multiple controllers, and the current lack of comprehensive monitoring makes it difficult to manage and troubleshoot the system effectively. We at...

priority/important-longterm

kind/feature

area/stability

Some prompts with special characters fail the benchmark script

### 🐛 Describe the bug This is a follow-up to https://github.com/vllm-project/aibrix/issues/783. It seems that some specific prompt data fails the test: ``` python benchmark_serving.py --backend vllm --model deepseek-ai/deepseek-r1 --trust-remote-code...

kind/bug

priority/important-soon

area/benchmark

RayClusterFleet controller shows some reconciliation issues

### 🐛 Describe the bug ![Image](https://github.com/user-attachments/assets/75742fa7-37f6-4ca1-b03d-0de70f2632a1) ### Steps to Reproduce Use the deepseek-r1-local-volume version. ### Expected behavior It should create the pods successfully. ### Environment commit: 1f96e9a47aab42cf003ed4bd031701eea754332c24b6b10c691ee5a842eeceb1

kind/bug

area/distributed

Move the benchmark code to the aibrix Python package

### 🚀 Feature Description and Motivation We can consider moving the code to https://github.com/vllm-project/aibrix/tree/main/python/aibrix ![Image](https://github.com/user-attachments/assets/06999554-6f59-48da-a43d-5ff1af9fbc71) This makes the generator/benchmark code reusable. ### Use Case As a user,...

kind/feature

area/benchmark

area/performance

v0.4.0 roadmap

### 🚀 Feature Description and Motivation We’re actively evolving AIBrix to support more advanced and production-ready LLM serving capabilities. For v0.4.0 and beyond, our roadmap includes: - Prefill & Decode...

kind/documentation

priority/important-soon

upstream connect error or disconnect/reset before headers

### 🐛 Describe the bug ![Image](https://github.com/user-attachments/assets/aca3c823-507f-4dde-9d04-0adee8f8c2e8) Can we list potential reasons for connection issues? In recent benchmark testing, I have seen a few similar cases. ### Steps to Reproduce Run benchmark,...

[RFC]: Add Support for Prefill/Decode (P/D) Disaggregation in vLLM

### Summary To further optimize large-scale LLM inference workloads, we plan to introduce support for Prefill/Decode (P/D) disaggregation in vLLM. This separation allows prefill and decode stages to run on...

kind/enhancement

area/gateway

priority/critical-urgent

area/disaggregated

[CICD] Optimize kv cache image size

### 🚀 Feature Description and Motivation ![Image](https://github.com/user-attachments/assets/d4b2c067-ce70-4e62-97e1-775484ea8ac1) The image is very large now; we need to reduce its size a bit. ``` FROM ubuntu:22.04 RUN apt-get update &&...

help wanted

priority/important-longterm

area/cicd

area/installation

area/kv-cache

[RFC]: Adapt KVCache Offloading Framework for vLLM v1 Architecture

### Summary The current KVCache offloading framework is built around assumptions from the vLLM v0 architecture. With the release of vLLM v1, which introduces new cache handling semantics, especially the...

priority/important-soon

area/distributed

area/kv-cache