Jiayi Yao

Results 7 issues of Jiayi Yao

LMCache (https://github.com/LMCache/LMCache/tree/dev) uses the `kv_transfer` interface to support both KV cache offloading and disagg prefill. The original interfaces `recv_kv_caches_and_hidden_states` and `send_kv_caches_and_hidden_states` in `kv_connector` are used as wrappers to call `lmcache_retrieve_kv`...

XPYD support. Examples can be found at:
(1) 1p1d: https://github.com/LMCache/LMCache/blob/3e981844519c24abf21c55e86836ff509be3e8ae/examples/disagg_prefill/1p1d_experimental/README.md
(2) 2p2d: https://github.com/LMCache/LMCache/blob/3e981844519c24abf21c55e86836ff509be3e8ae/examples/disagg_prefill/xpyd_experimental/README.md

Key next steps: (1) TP + Hetero TP support (2) Compatibility with prefix caching/cpu offloading (3) Async...

There's a config variable called `save_decode_cache` (https://github.com/LMCache/LMCache/blob/dev/lmcache/v1/config.py#L58) that indicates whether to save the decode KV cache. However, this config variable no longer has any effect in lmcache/vllm v1. To...

good first issue

In this upcoming quarter, we'll be looking to improve LMCache significantly in the following areas.
- Higher Efficiency
- Better Infra
- Larger Community

To achieve this goal, we propose...

help wanted

We can use `auto()` to generate the values for the enum entries (e.g., in lmcache/v1/memory_management.py)
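As a minimal sketch of the suggestion, the snippet below shows how `enum.auto()` assigns values so that entries do not have to be numbered by hand. The enum name `ExampleFormat` and its members are hypothetical placeholders, not the actual definitions in lmcache/v1/memory_management.py.

```python
from enum import Enum, auto


class ExampleFormat(Enum):
    """Hypothetical enum; the member names are illustrative only."""

    # auto() assigns 1, 2, 3, ... in definition order for a plain Enum,
    # so inserting or reordering members never requires renumbering.
    UNDEFINED = auto()
    LAYERWISE = auto()
    BLOB = auto()


if __name__ == "__main__":
    for member in ExampleFormat:
        print(member.name, member.value)
```

Note that `auto()` starts at 1 for a plain `Enum`, so code that relies on a specific integer value (for example, a zero-valued member or a serialized value) would need to keep explicit values for those entries.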

good first issue

Support `pin(instance_id: str, tokens: List[int], ttl: Optional[float])`. TODO: - [ ] Make global view in controller consistent with cache expiration (might need a periodic sweeper to do this). --- PR...
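To make the TODO above concrete, here is a rough sketch of a pin record with TTL-based expiry and a sweeper pass. Only the `pin(instance_id, tokens, ttl)` signature comes from the description; the `PinRegistry` class, `is_pinned`, `sweep`, and the injectable clock are assumptions for illustration and do not reflect the controller's actual implementation.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

PinKey = Tuple[str, Tuple[int, ...]]


class PinRegistry:
    """Hypothetical in-memory record of pinned token sequences."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Maps (instance_id, tokens) to an expiry time, or None for no expiry.
        self._pins: Dict[PinKey, Optional[float]] = {}

    def pin(self, instance_id: str, tokens: List[int], ttl: Optional[float]) -> None:
        """Pin the tokens on an instance; ttl=None means the pin never expires."""
        expiry = None if ttl is None else self._clock() + ttl
        self._pins[(instance_id, tuple(tokens))] = expiry

    def is_pinned(self, instance_id: str, tokens: List[int]) -> bool:
        """Return True if a pin exists and has not yet expired."""
        key = (instance_id, tuple(tokens))
        if key not in self._pins:
            return False
        expiry = self._pins[key]
        return expiry is None or self._clock() < expiry

    def sweep(self) -> int:
        """Drop expired pins and return how many were removed.

        A periodic caller (timer thread, async task) would invoke this so
        that the stored view does not keep entries whose TTL has passed.
        """
        now = self._clock()
        expired = [k for k, e in self._pins.items() if e is not None and e <= now]
        for key in expired:
            del self._pins[key]
        return len(expired)
```

In this sketch, `is_pinned` already treats an expired entry as unpinned, so the sweeper only reclaims stale records; how often to run it is a trade-off between bookkeeping cost and how long stale entries linger.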

Add a batched get interface. This interface lets storage backends perform batched get operations. PR https://github.com/LMCache/LMCache/pull/912 and https://github.com/LMCache/LMCache/pull/863 can leverage this. --- PR Checklist (Click to Expand)...
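One way such an interface could look is sketched below: a base class offers a default `batched_get` that falls back to per-key `get`, and a backend overrides it when it can serve several keys in one operation. The class names, the `str` keys, and the `bytes` values are assumptions made for this example, not LMCache's actual storage backend API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ExampleBackend(ABC):
    """Hypothetical storage backend interface."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the value for key, or None if it is missing."""

    def batched_get(self, keys: List[str]) -> List[Optional[bytes]]:
        """Default fallback: one get per key, results in the same order."""
        return [self.get(key) for key in keys]


class DictBackend(ExampleBackend):
    """Toy backend backed by a dict, overriding the batched path."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = dict(data)

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def batched_get(self, keys: List[str]) -> List[Optional[bytes]]:
        # A real backend might issue a single I/O or network request here;
        # this toy version just resolves all keys in one pass.
        lookup = self._data.get
        return [lookup(key) for key in keys]


if __name__ == "__main__":
    backend = DictBackend({"a": b"1", "b": b"2"})
    print(backend.batched_get(["a", "missing", "b"]))
```

Keeping a default implementation in the base class means existing backends keep working unchanged, while backends that benefit from batching can opt in by overriding one method.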