Yihua Cheng

Results: 77 comments by Yihua Cheng

@XinyuJiangCMU Hey, thanks for your interest! Let me assign it to you. Looking forward to your PR!

@maobaolong I think there is another ongoing effort for CPU offloading: #19854

@chenqianfzh @rainj-me Just curious, how much overhead will it introduce if we do not save the KV cache and instead let the decoding instance decode 1 token?

@hickeyma Hey Martin, I thought this PR was no longer needed, since it will not be used with the latest vLLM. @chenqianfzh @rainj-me Please let us know if we can...

@wangxiaoyang-dev Good catch! This is a bug. Feel free to create a PR to fix this.