aibrix [Question] How to access the vLLM-Vineyard integration code mentioned in Distributed KV Cache documentation?

The Distributed KV Cache documentation references a customized vLLM implementation with Vineyard integration.

However, I couldn't locate the corresponding code implementation.

Could you help me clarify:

Is the customized vLLM code with Vineyard support publicly accessible?
If available, is it in a separate branch/repository?

Feb 24 '25 03:02 cheyang

Hi Yang,

https://github.com/vllm-project/aibrix/blob/6feec99d77c84e371da9c535054c2b8aa8912704/pkg/controller/kvcache/kvcache_controller.go#L505

I think it uses a customized vllm, something like https://github.com/vllm-project/vllm/compare/main...aibrix:vllm:feat/distributed-kv-cache

From what I understand, this only works with v0.6.x of vLLM.
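
For context, the controller line linked above builds the `vineyardd` launch command with `fmt.Sprintf`. The only value substituted in is the KVCache resource name, which becomes the prefix of the etcd service hostname. Below is a minimal Go sketch of just that substitution. The helper name `etcdEndpoint` and the resource name `demo` are made up for illustration; the real controller formats the whole command string in a single call.

```go
package main

import "fmt"

// etcdEndpoint is a hypothetical helper, not a function from the AIBrix repo.
// It reproduces only the "--etcd_endpoint" portion of the format string used
// by the controller: the KVCache name is prefixed to "-etcd-service".
func etcdEndpoint(kvCacheName string) string {
	return fmt.Sprintf("http://%s-etcd-service:2379", kvCacheName)
}

func main() {
	// "demo" is an illustrative KVCache resource name.
	fmt.Println(etcdEndpoint("demo")) // prints http://demo-etcd-service:2379
}
```

So a KVCache resource named `demo` would, under this reading, have vineyardd talk to an etcd service called `demo-etcd-service` on port 2379.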

Feb 24 '25 06:02 gaocegege

Great to see you here, @cheyang, long time no see. Yeah, @gaocegege pointed to the right code in vLLM.

The vLLM code will be refactored to adapt to the v1 architecture, and an RFC will be cut soon. This part will definitely be upstreamed.
@DwyaneShi made some changes in vineyard, such as metadata optimization and advanced eviction policies. @DwyaneShi, feel free to comment if you have more information you'd like to share.

Feb 24 '25 21:02 Jeffwan

Hi Yang,

aibrix/pkg/controller/kvcache/kvcache_controller.go

Line 505 in 6feec99

fmt.Sprintf("/usr/local/bin/vineyardd --sync_crds true --socket /var/run/vineyard.sock --size --stream_threshold 80 --etcd_cmd etcd --etcd_prefix /vineyard --etcd_endpoint http://%s-etcd-service:2379", kvCache.Name),

I think it uses a customized vllm, something like vllm-project/vllm@main...aibrix:vllm:feat/distributed-kv-cache

From what I understand, this only works with v0.6.x of vLLM.

Thanks for pointing this out!

Feb 25 '25 06:02 cheyang

I think it uses a customized vllm, something like vllm-project/vllm@main...aibrix:vllm:feat/distributed-kv-cache

Is this the actual code for the distributed KV cache used in AIBrix? I would like to make changes to it. Where can I find the actual implementation? Thanks.

Mar 06 '25 17:03 nechamab1

@nechamab1 we have released the latest version of KV cache support. Please check the documentation for more details.

All of the code is now hosted in this repo: https://github.com/vllm-project/aibrix/blob/main/python/aibrix_kvcache/integration/vllm/vllm_v0.8.5-aibrix-kvcache.patch

This version improves performance a lot, and I suggest you take a look at it.

May 23 '25 21:05 Jeffwan