Jiaxin Shan issues

Results: 271 issues of Jiaxin Shan

[batch] The constant DEFAULT_JOB_POOL_SIZE = 1 is hard-coded; it should be configurable.

area/batch
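
One way the request above could be met is to read the pool size from the environment and fall back to the current default. This is a minimal sketch, not the project's implementation: the variable name `AIBRIX_BATCH_JOB_POOL_SIZE` and the helper `resolve_job_pool_size` are hypothetical, and only `DEFAULT_JOB_POOL_SIZE = 1` comes from the issue title.

```python
import os

DEFAULT_JOB_POOL_SIZE = 1  # value quoted in the issue title


def resolve_job_pool_size(env=None, var_name="AIBRIX_BATCH_JOB_POOL_SIZE"):
    """Return the job pool size, preferring an environment override.

    `var_name` is a hypothetical name chosen for this sketch.
    `env` can be any mapping; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None or raw.strip() == "":
        # No override supplied: keep the existing default.
        return DEFAULT_JOB_POOL_SIZE
    try:
        size = int(raw)
    except ValueError:
        raise ValueError(f"{var_name} must be an integer, got {raw!r}") from None
    if size < 1:
        raise ValueError(f"{var_name} must be >= 1, got {size}")
    return size
```

Passing the mapping explicitly keeps the helper easy to test without touching the real process environment.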

[batch] Currently only `/v1/chat/completions` is tested. We need to expand coverage to at least `/v1/completions`, `/v1/embeddings`, and `/v1/moderation`

area/batch

batch api further improvements tracking issue

### 🐛 Describe the bug There are a few issues after reviewing and testing the latest batch API. ### Steps to Reproduce Listed above. ### Expected behavior N/A ### Environment nightly

area/batch

Ensure aibrix_kvcache Torch version stays compatible with latest vLLM/SGLang releases

### 🚀 Feature Description and Motivation We’ve hit issues where the aibrix_kvcache Python package cannot be installed or used together with the latest vLLM/SGLang versions due to Torch version incompatibilities....

WIP: [Feat] Support StormService pause rollout in upgrade

## Pull Request Description [Please provide a clear and concise description of your changes here] ## Related Issues Resolves: #[Insert issue number(s)] **Important: Before submitting, please complete the description above...

Support storm service roleset and roles in AIBrix routing cache

### 🚀 Feature Description and Motivation Currently, we only support pods in the router cache. Even for multi-node setups, we only fetch the `ray-head` pod. Now, we introduce storm services...

area/gateway

priority/critical-urgent

streaming mode doesn't work for in-house engine

### 🐛 Describe the bug ``` {"id": "chatcmpl-1753420585601835809", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "reasoning_content": "", "tool_calls": []}, "finish_reason": "stop"}], "created": 1753420585, "model": "deepseek-r1", "system_fingerprint": "fp", "object": "chat.completion.chunk",...

area/gateway

area/inference-engine

port in annotation is not picked by gateway pod

### 🚀 Feature Description and Motivation ``` labels: model.aibrix.ai/name: qwen3-8B model.aibrix.ai/port: "30000" model.aibrix.ai/engine: sglang spec: nodeSelector: kubernetes.io/hostname: 192.168.0.6 containers: - name: decode image: kvcache-container-image-hb2-cn-beijing.cr.volces.com/aibrix/sglang:v0.4.9.post3-cu126-nixl-v0.4.1 command: ["sh", "-c"] args: - |...

Support HTTPRoute for StormService

### 🚀 Feature Description and Motivation
```
curl http://localhost:8888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "stream": false,
    "messages": [
      {"role": "user", "content": "1111"}
    ]
  }'
```
gateway...

area/gateway

priority/critical-urgent

Support Mooncake P/D conductor algorithm in AIBrix router

### 🚀 Feature Description and Motivation In Mooncake's paper https://www.usenix.org/system/files/fast25-qin.pdf, chapter 4 covers prefill and load-aware scheduling algorithms. Let's put some effort here to reproduce this paper and...

area/gateway

priority/important-soon
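
To make the idea of prefill and load-aware scheduling concrete, here is a minimal sketch of picking a prefill instance by estimated time to first token. It is not the paper's algorithm and not AIBrix router code: the `PrefillInstance` fields, the linear cost model, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PrefillInstance:
    # All fields are hypothetical inputs for this sketch.
    name: str
    queued_tokens: int          # tokens already waiting to be prefilled
    cached_prefix_tokens: int   # prompt prefix tokens already cached here
    tokens_per_second: float    # assumed prefill throughput


def estimate_ttft(inst: PrefillInstance, prompt_tokens: int) -> float:
    """Estimate time to first token with a simple linear cost model.

    Assumption: queued work and the uncached part of the prompt are
    processed at a constant rate; cached prefix tokens cost nothing.
    """
    uncached = max(prompt_tokens - inst.cached_prefix_tokens, 0)
    return (inst.queued_tokens + uncached) / inst.tokens_per_second


def pick_prefill_instance(instances, prompt_tokens: int) -> PrefillInstance:
    """Choose the instance with the lowest estimated time to first token."""
    if not instances:
        raise ValueError("no prefill instances available")
    return min(instances, key=lambda inst: estimate_ttft(inst, prompt_tokens))
```

Under this model a busier instance can still win when it holds enough of the prompt's prefix in cache, which is the trade-off between cache reuse and load that such a scheduler has to weigh.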