Haiyang Shi
> I used wrong endpoint here but even I use wrong one, does IPC connection helps?
>
> ```
> - name: AIBRIX_LLM_KV_CACHE_RPC_ENDPOINT
>   value: "aibrix-mode-deepseek-coder-7b-kvcache-rpc:9600"
> ```
>
> ...
> [@Jeffwan](https://github.com/Jeffwan) [@DwyaneShi](https://github.com/DwyaneShi) Could you check this issue? I have been just waiting for the engine pods to be restarted. But it is not supposed to be like that. This...
@gangmuk Removing IPC is an option. Right now we do not rely heavily on the IPC connection, so feel free to have a look at the codebase to see if...
@stellarzhou Thanks for trying out the vineyard-based KV cache. In this vineyard-based implementation, there is a client-side cache within vineyard's client (its capacity is also VINEYARD_CACHE_CPU_MEM_LIMIT_GB), which uses the S3FIFO eviction policy...
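For readers unfamiliar with S3FIFO, here is a minimal Python sketch of the policy. It is a simplified illustration of the published algorithm, not vineyard's actual client code. The class name `S3FIFOCache`, the 10% small-queue ratio, resetting the hit counter on promotion, and counting capacity in entries instead of bytes are all assumptions made for this example.

```python
from collections import OrderedDict


class S3FIFOCache:
    """Illustrative S3-FIFO cache (hypothetical class, not vineyard's code).

    Three FIFO queues: `small` admits new keys, `main` holds keys that proved
    useful, and `ghost` remembers keys (without values) recently evicted from
    `small`. Capacity is counted in entries here for simplicity.
    """

    MAX_FREQ = 3  # hit counter is capped, as in the published algorithm

    def __init__(self, capacity, small_ratio=0.1):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.capacity = capacity
        # Keep at least one slot for each of the two resident queues.
        self.small_cap = min(max(1, int(capacity * small_ratio)), capacity - 1)
        self.ghost_cap = capacity - self.small_cap
        self.small = OrderedDict()  # key -> [value, freq]
        self.main = OrderedDict()   # key -> [value, freq]
        self.ghost = OrderedDict()  # key -> None

    def get(self, key, default=None):
        for queue in (self.small, self.main):
            entry = queue.get(key)
            if entry is not None:
                # A hit only bumps the counter; queue order never changes.
                entry[1] = min(entry[1] + 1, self.MAX_FREQ)
                return entry[0]
        return default

    def put(self, key, value):
        for queue in (self.small, self.main):
            if key in queue:
                queue[key][0] = value
                queue[key][1] = min(queue[key][1] + 1, self.MAX_FREQ)
                return
        while len(self.small) + len(self.main) >= self.capacity:
            self._evict()
        if key in self.ghost:
            # Seen recently: skip the probationary queue.
            del self.ghost[key]
            self.main[key] = [value, 0]
        else:
            self.small[key] = [value, 0]

    def _evict(self):
        if len(self.small) >= self.small_cap and self._evict_small():
            return
        self._evict_main()

    def _evict_small(self):
        """Pop from the head of `small`; return True once a key is evicted."""
        while self.small:
            key, (value, freq) = self.small.popitem(last=False)
            if freq > 1:
                self.main[key] = [value, 0]  # promoted, counter reset
            else:
                self.ghost[key] = None
                while len(self.ghost) > self.ghost_cap:
                    self.ghost.popitem(last=False)
                return True
        return False

    def _evict_main(self):
        """Give keys with remaining hits another pass; evict the first with none."""
        while self.main:
            key, (value, freq) = self.main.popitem(last=False)
            if freq > 0:
                self.main[key] = [value, freq - 1]
            else:
                return
```

In this sketch, a key that is hit at least twice while in the small queue is promoted to the main queue when it reaches the head, while a key that is only inserted and never read again is dropped quickly and leaves just a ghost entry. That is why one-off accesses do not push frequently reused entries out of the cache.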
@jlcoo Thanks for trying out the distributed KV cache offloading feature. We will support more attention backends soon, so please stay tuned.
@jlcoo We released v0.3.0 recently, and it now supports the XFormers backend. It would be great if you could try the latest version. Please refer to the...
> > [@libin817927](https://github.com/libin817927) what's your environments? Are you running on volcano engine? if so, please share the node image details. If not, please help me know how to allocate RDMA...
@libin817927 Thanks for trying out the KV cache offloading feature. If you'd like to use AIBRIX_KV_CACHE_OL_L1_CACHE_CAPACITY_GB=80 (i.e., an extra 80 GB of DRAM per engine process for KV cache offloading),...
> hello, may i ask have you resolve this problem or figure out what happen to it ? today i try this feature occur the same situation. > > (base)...