Yihua Cheng comments

Results: 77 comments of Yihua Cheng

WSL OOM Error When Running 14B INT4 Quantized Model on 24GB GPU with lmcache_vllm at 90% gpu-memory-utilization

Hey @chrisoutwright, thanks for submitting this issue to LMCache. Currently, LMCache uses a temporary GPU buffer when copying the KV cache out of vLLM. Therefore, it needs some extra GPU...
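To make the memory budget concrete, here is a rough sketch of the headroom arithmetic. It assumes the nominal 24 GB figure from the issue title and that `--gpu-memory-utilization` caps the fraction of device memory vLLM reserves for itself. The function name `headroom_gb` is hypothetical, and the size of LMCache's temporary buffer is not stated in this comment, so this only shows how much memory is left outside vLLM's share.

```python
def headroom_gb(total_gb: float, gpu_memory_utilization: float) -> float:
    """Return the device memory (in GB) left outside vLLM's reserved share.

    Hypothetical helper for illustration only; it is not part of vLLM or LMCache.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return total_gb * (1.0 - gpu_memory_utilization)


if __name__ == "__main__":
    # Nominal 24 GB card at 90% utilization, as in the issue title.
    for utilization in (0.90, 0.85, 0.80):
        left = headroom_gb(24.0, utilization)
        print(f"utilization={utilization:.2f} -> about {left:.1f} GB outside vLLM")
```

Anything else that needs GPU memory, including a temporary copy buffer, the CUDA context, and other processes on the device, has to fit in that remainder, which is why a lower utilization setting may help in a case like this.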

Introduce a remote monitor thread to monitor remote and support fallback to blackhole

On my list now. I'll review it when I get a chance!

Introduce a remote monitor thread to monitor remote and support fallback to blackhole

@YaoJiayi Please take a quick look, thanks!

Introduce a remote monitor thread to monitor remote and support fallback to blackhole

Jiayi may have limited availability these days. @Shaoting-Feng @sammshen, can you take a look at this PR?

[Misc] add two simple utility functions

@yanok Thanks for the PR! Can you fix the format checker by running "format.sh" locally? Thanks!

[V1] [P/D] Add Support for KV Load Failure Recovery

Thanks @sdavidbd! I like the idea of this PR. This will also be very useful for LMCache's use cases.

[bug] Local CPU memory leaking when using remote connector

Thanks @ningziwen! Regarding this:
> I'm seeing multiple cases of local cpu memory leaking.
>
> 1. All of items are pinned and never unpinned. Local CPU backend state: total_items=18,...

[Bug] ERROR:The number of retrieved tokens is less than the expected number of tokens! This should not happen!

We have fixed this problem before. You can try a newer version of LMCache (`lmcache == 0.3.5`). For more detailed compatibility settings, please see https://docs.lmcache.ai/getting_started/installation.html#compatibility-matrix

[RFC]: KV cache offloading

Thanks for the great work! Looks like the design is pretty similar to the existing KV connector API. Why don't we just directly use the existing connector API? cc @KuntaiDu

[RFC]: KV cache offloading

@josephrocca Hi, can you point me to the issue you submitted in LMCache? I'm not able to find that by filtering by your ID. But sorry if we have missed...