Lu Changqi
Added a benchmark result. 2. Online server benchmark 2.3 Recompute (997 recomputations): The recompute result is basically the same as the result of swapping to the CPU. Server: `python3 -m...
@DarkLight1337 @ywang96 @youkaichao Hi, I have implemented a simple external storage. Please help review the code.
@zhouyuan @tmm1 @orsharir Ping!
It's a good job!
@KuntaiDu Hi! Based on your commit, I implemented Valkey (supporting TCP and RDMA) as the KV cache storage pool for the prefill and decode nodes (https://github.com/vllm-project/vllm/pull/8724). Due to license reasons, I...
> > @KuntaiDu Hi! Based on your commit, I implemented Valkey (supporting TCP and RDMA) as the KV cache storage pool for the prefill and decode nodes (#8724). Due to license...