Yihua Cheng

Results: 10 issues by Yihua Cheng

### Your current environment

The output of `python collect_env.py`

```text
INFO 03-01 00:48:13 [__init__.py:207] Automatically detected platform cuda.
Collecting environment information...
PyTorch version: 2.5.1+cu124
Is debug build: False
CUDA used...
```

bug

### **TL;DR:** This PR opens the KV connector API in v1 to support disaggregated prefill. It also includes a minimal functional implementation as an example of how to use the...

documentation
ready
ci/build
v1

**TL;DR:** This PR implements a CPU-based connector for PD disaggregation, with the following features:

- [x] Async layerwise D2H copies on the prefiller side
- [x] (WIP) Async layerwise H2D copies at...

v1

# Background We currently have a comprehensive suite of unit tests that covers the internal functionality of LMCache. However, these tests are mostly scoped within the LMCache repository and do...

help wanted
Testing
discussion

**Is your feature request related to a problem? Please describe.** Currently, the configuration file is a flat YAML file, and the config class is a flat dataclass. It will become...

enhancement
good first issue
Refactoring

**Is your feature request related to a problem? Please describe.** In #678, we saw a performance problem in the current GPU connector implementation, and @yanok provided a fix in `VLLMPagedMemGPUConnectorV2`...

enhancement
good first issue
help wanted

After #843 is merged, the LMCache-vLLM integration can be fully decoupled from vLLM. We need to update the documentation so that users know how to use the new LMCache-vLLM connector.

good first issue

This issue tracks the follow-up tasks after #446 is merged.

- [x] #474
- [ ] MLA KV cache shape support
- [x] #463
- [x] #464...

**What this PR does / why we need it**: This PR adds new memory copy kernels using CUDA batched memcpy. **Special notes for your reviewers**: The new kernel requires CUDA...

# Description vLLM recently added support for native Prometheus metrics for KV connectors (see vllm-project/vllm#26811). Right now, the LMCache Prometheus support uses the local file system to pass the...

good first issue
help wanted
new feature