Jiaxin Shan
### 🐛 Describe the bug There are a few issues after reviewing and testing the latest batch API. ### Steps to Reproduce Listed above. ### Expected behavior N/A ### Environment nightly
### 🚀 Feature Description and Motivation We’ve hit issues where the aibrix_kvcache Python package cannot be installed or used together with the latest vLLM/SGLang versions due to Torch version incompatibilities....
### 🚀 Feature Description and Motivation Currently, we only support pods in the router cache. Even for multi-node setups, we only fetch the `ray-head` pod. Now, we introduce storm services...
### 🐛 Describe the bug ``` {"id": "chatcmpl-1753420585601835809", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "", "reasoning_content": "", "tool_calls": []}, "finish_reason": "stop"}], "created": 1753420585, "model": "deepseek-r1", "system_fingerprint": "fp", "object": "chat.completion.chunk",...
### 🚀 Feature Description and Motivation ``` labels: model.aibrix.ai/name: qwen3-8B model.aibrix.ai/port: "30000" model.aibrix.ai/engine: sglang spec: nodeSelector: kubernetes.io/hostname: 192.168.0.6 containers: - name: decode image: kvcache-container-image-hb2-cn-beijing.cr.volces.com/aibrix/sglang:v0.4.9.post3-cu126-nixl-v0.4.1 command: ["sh", "-c"] args: - |...
### 🚀 Feature Description and Motivation

```
curl http://localhost:8888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "stream": false,
    "messages": [
      {"role": "user", "content": "1111"}
    ]
  }'
```

gateway...
### 🚀 Feature Description and Motivation Chapter 4 of the Mooncake paper (https://www.usenix.org/system/files/fast25-qin.pdf) discusses prefill and load-aware scheduling algorithms. Let's put some effort into reproducing this paper and...