Jiaxin Shan

Results 742 comments of Jiaxin Shan

@sadath-12 sounds good. We will ask @Xunzhuo for help on this issue.

We need to fix these issues in v0.5.0

https://github.com/vllm-project/aibrix/pull/1698#discussion_r2462650519 good suggestion

https://github.com/vllm-project/aibrix/pull/1700#discussion_r2462669774

@zhengkezhou1 this is a great idea. It would be helpful for performance-related testing.

I will move this issue to a future release. Currently, it can detect the role but doesn't support hierarchy.

One community user is asking for this feature: he wants requests routed to `p0d0` or `p1d1` rather than `p0d1`.

https://github.com/vllm-project/aibrix/pull/1409 this is partially supported. However, it only considers prefix cache hits and ignores the overall replica load. We need to fix that in a follow-up PR.
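
To illustrate the gap, here is a minimal sketch of a selection rule that weighs prefix cache hits against replica load. It is not the code from the PR: the `Replica` fields, the `pick_replica` function, and the `load_weight` parameter are all hypothetical names chosen for this example, and the linear score is just one possible way to combine the two signals.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # All fields are illustrative assumptions, not AIBrix types.
    name: str
    prefix_hit_ratio: float  # fraction of prompt tokens already cached, 0.0 to 1.0
    load: float              # normalized load, 0.0 (idle) to 1.0 (saturated)


def pick_replica(replicas, load_weight=0.5):
    """Return the replica with the highest combined score.

    load_weight=0.0 reproduces prefix-only routing; larger values
    penalize busy replicas more strongly.
    """
    if not replicas:
        raise ValueError("no replicas available")

    def score(r):
        return (1.0 - load_weight) * r.prefix_hit_ratio - load_weight * r.load

    return max(replicas, key=score)


replicas = [
    Replica("a", prefix_hit_ratio=0.9, load=0.95),  # hot cache, nearly saturated
    Replica("b", prefix_hit_ratio=0.6, load=0.10),  # cooler cache, mostly idle
]
prefix_only = pick_replica(replicas, load_weight=0.0)
balanced = pick_replica(replicas, load_weight=0.5)
```

With `load_weight=0.0` the rule picks the replica with the most cache hits even though it is nearly saturated; with a non-zero weight the idle replica can win.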

@jiangxiaobin96 in some scenarios, P & D are deployed on the same host due to lack of RDMA, etc. It's better to route requests within that group instead of choosing...
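
A rough sketch of what group-local pairing could look like, assuming each pod carries a role and a group label. The `Pod` type, the `group` field, and `pick_pair` are hypothetical names for this example, not existing AIBrix APIs, and the fallback to a cross-group pair is an assumption about the desired behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    # Illustrative fields only; "group" stands in for whatever label
    # identifies co-located prefill and decode pods (e.g. the same host).
    name: str
    role: str   # "prefill" or "decode"
    group: str


def pick_pair(pods):
    """Return a (prefill, decode) pair, preferring pods in the same group.

    Falls back to the first available cross-group pair, and returns None
    when either role is missing.
    """
    prefills = [p for p in pods if p.role == "prefill"]
    decodes = [p for p in pods if p.role == "decode"]
    for p in prefills:
        for d in decodes:
            if p.group == d.group:
                return p, d
    if prefills and decodes:
        return prefills[0], decodes[0]
    return None


pods = [
    Pod("p0", "prefill", "g0"),
    Pod("p1", "prefill", "g1"),
    Pod("d1", "decode", "g1"),
    Pod("d0", "decode", "g0"),
]
same_group = pick_pair(pods)
cross_group = pick_pair([Pod("p0", "prefill", "g0"), Pod("d1", "decode", "g1")])
```

Here `same_group` pairs `p0` with `d0` even though `d1` appears first in the list, while `cross_group` shows the fallback when no co-located pair exists.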