libin817927
Crucial feature, eagerly awaited.
Hi, I'm trying to deploy in the company's k8s cluster. I expect to deploy the Qwen3 32B model as a small cluster on 4 NVIDIA L4 GPUs. I'll describe my...
> [@libin817927](https://github.com/libin817927) thanks for your detailed questions. These are fair and reasonable asks. I happened to do some testing against the v0.3.0-rc release. I can list the missing pieces, especially the kv...
> [@libin817927](https://github.com/libin817927) what's your environment? Are you running on Volcano Engine? If so, please share the node image details. If not, please let me know how to allocate RDMA resources...
> > > [@libin817927](https://github.com/libin817927) what's your environment? Are you running on Volcano Engine? If so, please share the node image details. If not, please let me know how to allocate...
> [@TianTengya](https://github.com/TianTengya) Yes. P&D is not the focus; we are busy with kv cache solutions and plan to fully unblock prefix-cache scenarios first. The next step would be xPyD. I...
> [@libin817927](https://github.com/libin817927) Thanks for trying out the kv cache offloading feature. If you'd like to use AIBRIX_KV_CACHE_OL_L1_CACHE_CAPACITY_GB=80 (i.e., use an extra 80 GB of DRAM per engine process for kv cache offloading...
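
For reference, a minimal sketch of how that setting might appear in a Kubernetes pod spec. Only the environment variable name and the 80 GB value come from the comment above; the container layout and the memory request are assumptions (the pod would need DRAM headroom beyond the model's own usage for the L1 cache to fit):

```yaml
# Hypothetical container fragment; only the env var is from the thread.
env:
  - name: AIBRIX_KV_CACHE_OL_L1_CACHE_CAPACITY_GB
    value: "80"   # reserve an extra 80 GB of DRAM per engine process for kv cache offloading
resources:
  requests:
    memory: "96Gi"   # assumed: must cover the 80 GB L1 cache plus the engine's baseline usage
```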