Jiaxin Shan
@vivekskrishna We will provide a cloud-native solution soon. Its main purpose is to cover P/D pooling and xPyD mode; it will also cover the general TP or PP multi-host...
@vivekskrishna Can you check this solution? https://aibrix.readthedocs.io/latest/designs/aibrix-stormservice.html We will use this controller to manage multi-node and P/D cases in the future.
Stormservice is the primary orchestration offering in aibrix. It can be used to support P/D and can take over the role of RayClusterFleet as well. We will not adopt LWS at this...
@xinji1 Do you think extending the queue support outside the engine would be an option? I feel that many batch scenarios need global information. I am wondering whether there's better...
Some code changes might conflict with https://github.com/vllm-project/aibrix/pull/878. That's not a problem for now. We can review the core logic first and rebase the changes later.
I will take a look today.
/cc @Xunzhuo @varungup90 Please help review it. @zhangjyr This is still a huge PR. If we have some agreement on the data structure and helper utilities, I suggest adding...
@Xunzhuo Jingyuan fixed the integration test issues. Have you had a chance to take a look?
@Xunzhuo Just wanted to loop you in: this PR is currently blocking some performance-related changes (prefix-cache aware routing) that community users are waiting for. It's quite large and deserves more attention....
/cc @varungup90 Please help take a look.