Yihua Cheng

77 comments by Yihua Cheng

Hey @wz1qqx, we will be happy to merge it if you fix the DCO and pre-commit issues.

This kernel does not work well under high concurrency. Closing this PR.

Yeah, we do want to do this. Btw, for CacheGen (or other serdes in the future), the plan is to move it into GPUConnector (we are still thinking about the...

@hammersam Thanks for the proposal. Feel free to create an RFC issue and talk about your plan there. IIUC, there are 2 things to do: - check the compression ratio...

Hey @hickeyma, I think this makes sense at a high level. A few comments and questions: 1. I think we should use vLLM-agnostic data structures for the KV events...

Regarding the usage of KV cache events, I know there are some ongoing projects in the community that implement routing logic based on the KV events. And the KV events...

Like this! We should put something into the contributing guide (do we have a contributing guide now?)

cc @sammshen @kobe0938 @KuntaiDu @YaoJiayi @hickeyma

@maobaolong Thanks for the follow-up. I'm actually good with the current PR. Let me know if you would like to proceed. My thoughts on how to address the "unpin never...

cc @maobaolong @hickeyma. Let me know if you have any other thoughts on this.