Nachama Bar Natan comments

Results: 5 comments of Nachama Bar Natan

CacheBlend fails on sending exact same prompt a second time

Related to https://github.com/LMCache/LMCache/issues/845#issuecomment-2985296839

CacheBlend fails on sending exact same prompt a second time

@YaoJiayi Could you please review this PR? Thanks! Just to clarify: this PR addresses the case where some chunks are retrieved from storage and others from HBM. As far as...

[Question] How to access the vLLM-Vineyard integration code mentioned in Distributed KV Cache documentation?

> I think it uses a customized vllm, something like [vllm-project/vllm@main...aibrix:vllm:feat/distributed-kv-cache](https://github.com/vllm-project/vllm/compare/main...aibrix:vllm:feat/distributed-kv-cache)

Is this the actual code for the distributed KV cache used in AIBrix? I would like to make changes...

Questions about CacheBlend Implementation

Hi @YaoJiayi, thanks a lot for the clear answer! Regarding the `old_positions` update, I couldn’t figure out who is responsible for updating `memory_obj.metadata.old_positions` with the correct values. I would...

Questions about CacheBlend Implementation

Hi, I have another question 🙂: regarding the following code: (http://github.com/LMCache/LMCache/blob/dev/lmcache/v1/compute/blend/blender.py#L93)

```
if layer_id in self.common_metadata.check_layers:
    diff_k = torch.sum(
        (k.to(torch.float32) - old_k.to(torch.float32)) ** 2, dim=[1]
    )
    total_len = diff_k.shape[0]
```
...
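For readers following the snippet quoted above, here is a minimal sketch of what the `torch.sum(..., dim=[1])` expression computes. It assumes `k` and `old_k` are 2-D with shape `[tokens, features]`, which the quoted excerpt does not state. It uses plain Python lists in place of tensors so it runs without torch, and the helper name `per_token_sq_diff` is hypothetical, not part of LMCache.

```python
def per_token_sq_diff(k, old_k):
    """Sum squared differences along axis 1, giving one value per row (token).

    Mirrors torch.sum((k - old_k) ** 2, dim=[1]) for 2-D inputs.
    Assumption: rows are tokens and columns are features.
    """
    if len(k) != len(old_k):
        raise ValueError("k and old_k must have the same number of rows")
    return [
        sum((a - b) ** 2 for a, b in zip(row, old_row))
        for row, old_row in zip(k, old_k)
    ]


# Toy values: 3 tokens, 2 features each.
k = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]
old_k = [[1.0, 2.0], [1.0, 1.0], [0.0, 0.0]]

diff_k = per_token_sq_diff(k, old_k)
total_len = len(diff_k)  # counterpart of diff_k.shape[0]

print(diff_k, total_len)  # [0.0, 2.0, 25.0] 3
```

Under that assumption, `diff_k` holds one deviation score per token and `total_len` is the number of tokens. A token whose key is unchanged scores zero.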