Jiaxin Shan

Results: 742 comments by Jiaxin Shan

I can launch the cluster successfully but can't move on to the next steps. ![image](https://github.com/user-attachments/assets/3f7b869e-20b5-46aa-bb18-dbc4eac77acd) ![image](https://github.com/user-attachments/assets/7d77145b-0f05-41a2-ab35-4619528d5928) --- It still seems to be an issue with my settings. I cleaned it up and reran the...

@jolfr Everything works perfectly. Thanks for the contribution. This is really awesome!

@ying2025 On the orchestration part: we talked with the Anyscale Ray team about adding Ray backend support for SGLang. If that doesn't go well, we can also use a cloud-native way to orchestrate...

@ying2025 If you do not use specific features like LoRA, you can easily replace the vLLM image with the SGLang image. AIBrix should be compatible with it. If you have some bandwidth,...

@ying2025 We plan to support SGLang in v0.4.0 (including metrics, routing, and P/D disaggregation). Let's use https://github.com/vllm-project/aibrix/issues/843 to track it. I will close this story.

@libin817927 P/D disaggregation involves many components, such as orchestration, proxy routing, and the components the connector relies on. We plan to provide an easy-to-use solution, and that's why we didn't say it's...

@LuxePlay We plan to extend support to multi-modality models. Could you give more details on the "speech transcription" ("语音转录") use case? BTW, does the "Z2 Mini G1a Workstation" come with a GPU?...

@TianTengya Yes. P/D is not the focus; we are busy with KV cache solutions and plan to fully unblock prefix-cache scenarios first. The next step would be xPyD. I will...

@libin817927 We need to support a better routing solution to balance P and D traffic. Otherwise, it's hard to show the benefits. In addition, there are two different P/D approaches, either...

We have already kicked off the work, and this is a top-priority item in v0.4.0.