
Support storm service roleset and roles in AIBrix routing cache

Open Jeffwan opened this issue 5 months ago • 5 comments

🚀 Feature Description and Motivation

Currently, the router cache only tracks individual pods. Even for multi-node setups, we only fetch the ray-head pod.

We have now introduced storm services, which support replica mode and pooling mode. To support prefill/decode (P/D) disaggregation, we need to refactor the internal cache so that it understands the role and roleset concepts.
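To make the roleset/role hierarchy concrete, here is a minimal sketch of a cache index keyed by roleset and then by role. It is only an illustration: the `Pod` fields, the `RoleSetIndex` class, and the example names are hypothetical and are not taken from the AIBrix codebase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    roleset: str  # hypothetical field, e.g. "rs-0"
    role: str     # hypothetical field, e.g. "prefill" or "decode"


class RoleSetIndex:
    """Indexes pods as roleset -> role -> [pods] (illustrative only)."""

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(list))

    def add(self, pod):
        self._index[pod.roleset][pod.role].append(pod)

    def pods(self, roleset, role):
        # Return a copy so callers cannot mutate the index;
        # unknown rolesets or roles yield an empty list.
        return list(self._index.get(roleset, {}).get(role, []))

    def rolesets(self):
        return sorted(self._index)


index = RoleSetIndex()
for pod in [
    Pod("p0", "rs-0", "prefill"),
    Pod("d0", "rs-0", "decode"),
    Pod("p1", "rs-1", "prefill"),
    Pod("d1", "rs-1", "decode"),
]:
    index.add(pod)
```

With a flat pod list the router cannot tell that `p0` and `d0` belong together; a two-level index like this one makes that relationship a direct lookup.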

Use Case

Support P/D disaggregation routing

Proposed Solution

No response

Jeffwan, Jul 10 '25 07:07

I will move this issue to a future release. Currently, the cache can detect the role but doesn't support the roleset/role hierarchy.

Jeffwan, Aug 01 '25 17:08

One community user is asking for this feature: they want requests routed to p0d0 or p1d1 (prefill and decode from the same roleset) rather than p0d1.

Jeffwan, Aug 05 '25 22:08

This is partially supported by https://github.com/vllm-project/aibrix/pull/1409. However, it only considers prefix cache hits and ignores the overall replica load. We need to fix that in a follow-up PR.
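As a rough illustration of the gap described here, the sketch below scores each candidate by its prefix cache hit ratio minus a load penalty. This is not what the linked PR or any follow-up does; the function name, the `load_weight` parameter, and the assumption that both signals are normalized to [0, 1] are all hypothetical.

```python
def pick_replica(candidates, load_weight=0.5):
    """Pick the candidate with the best hit-vs-load trade-off.

    candidates: list of (name, prefix_hit_ratio, load) tuples, where both
    numbers are assumed to be normalized to [0, 1] (hypothetical inputs).
    """
    def score(candidate):
        _, hit_ratio, load = candidate
        # Higher cache hit is better; higher load is penalized.
        return hit_ratio - load_weight * load

    return max(candidates, key=score)[0]


candidates = [
    ("rs-0", 0.9, 1.0),  # best prefix hit, but fully loaded
    ("rs-1", 0.6, 0.1),  # lower prefix hit, mostly idle
]
```

With `load_weight=0` the choice degenerates to prefix-hit-only selection, which always favors the saturated replica in this example; a positive weight lets the idle replica win.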

Jeffwan, Aug 08 '25 21:08

Hello, I'd like to know why prefill and decode pods need to be put into the same roleset here.

jiangxiaobin96, Sep 11 '25 11:09

@jiangxiaobin96 In some scenarios, P and D are deployed on the same host, for example because RDMA is not available. In that case it's better to route a request within that group than to pick P or D at random from the entire pool.
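A minimal sketch of that idea, assuming the cache exposes a mapping from roleset name to its prefill and decode pods (the `pick_pair` function and the data layout are hypothetical, not the AIBrix implementation): choose a roleset first, then take both pods from it, so a cross-group pair such as p0d1 can never be produced.

```python
import random


def pick_pair(rolesets, rng=random):
    """Pick a (prefill, decode) pair from a single roleset.

    rolesets: dict mapping roleset name -> {"prefill": [...], "decode": [...]}.
    Returns None when no roleset has both roles available.
    """
    # Only rolesets that have at least one pod of each role are usable.
    complete = sorted(
        name
        for name, roles in rolesets.items()
        if roles.get("prefill") and roles.get("decode")
    )
    if not complete:
        return None
    roles = rolesets[rng.choice(complete)]
    return rng.choice(roles["prefill"]), rng.choice(roles["decode"])


rolesets = {
    "rs-0": {"prefill": ["p0"], "decode": ["d0"]},
    "rs-1": {"prefill": ["p1"], "decode": ["d1"]},
    "rs-2": {"prefill": ["p2"], "decode": []},  # incomplete, never chosen
}
```

Random selection is used here only to keep the sketch short; a real router would presumably rank rolesets by signals such as cache hits and load.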

This is now supported. We can close the issue.

Jeffwan, Nov 18 '25 09:11