[Feature] Insert generated response in kv cache

Open aikitoria opened this issue 5 months ago • 0 comments

Checklist

  • [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, it will be closed.
  • [x] 2. To help the community, I will use Chinese/English or attach a Chinese/English translation if using another language. Non-English/Chinese content without translation may be closed.

Motivation

I noticed that in a single-user chat (alternating user and AI messages), after the AI has generated a message and I send a new prompt following it, the last AI message has to be re-processed as if it were new input. This is unnecessary: the generated tokens could be inserted into the KV cache during generation.
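
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is a toy model, not ktransformers code: `ToyKVCache`, `prefill`, and `append_generated` are hypothetical names, and the "cache" only tracks which token ids already have entries rather than holding real key/value tensors.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKVCache:
    """Hypothetical stand-in for a KV cache: stores only the cached token ids."""
    tokens: list = field(default_factory=list)

    def common_prefix_len(self, prompt):
        # Count how many leading tokens of the prompt are already cached.
        n = 0
        for cached, new in zip(self.tokens, prompt):
            if cached != new:
                break
            n += 1
        return n

    def prefill(self, prompt):
        # Keep the matching prefix, drop anything after it, and process
        # only the remaining tokens. Returns how many tokens were processed.
        keep = self.common_prefix_len(prompt)
        del self.tokens[keep:]
        todo = prompt[keep:]
        self.tokens.extend(todo)
        return len(todo)

    def append_generated(self, token):
        # The requested feature: a token produced during decoding stays
        # in the cache instead of being discarded after the response.
        self.tokens.append(token)


if __name__ == "__main__":
    cache = ToyKVCache()
    user_turn_1 = [1, 2, 3]
    ai_reply = [10, 11]
    user_turn_2 = [4, 5]

    print(cache.prefill(user_turn_1))      # first turn: whole prompt is new
    for tok in ai_reply:
        cache.append_generated(tok)        # reply is cached as it is generated

    full_prompt = user_turn_1 + ai_reply + user_turn_2
    print(cache.prefill(full_prompt))      # only the new user message is processed
```

In this toy, the second `prefill` call processes 2 tokens (the new user message) instead of 4 (the AI reply plus the new user message), which is the saving being requested.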

Related resources

No response

aikitoria, Jul 30 '25 13:07