jn12-29 comments


                                            jn12-29

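> This is related to how linear attention works. With ordinary attention, the KV cache for a prefix can be separated out of the full KV cache, but with linear attention it cannot. I will do some follow-up work to improve the cache hit rate as much as possible.

Could we consider copying the linear attention part of the KV state to SSD after each conversation turn?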
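To make the idea concrete, here is a rough toy sketch in Python. It is not the code of any real inference engine: the function names, the scalar "state", and the fake k/v values are all made up for illustration. It only shows why a per-token KV cache can be sliced at any prefix, while a linear attention state has to be snapshotted at each turn boundary if we want to resume from there later.

```python
# Toy illustration only. All names and the k/v formulas are hypothetical.

def per_token_kv_cache(tokens):
    """Ordinary attention: one (k, v) entry per token, so a prefix is just a slice."""
    return [(float(t), 2.0 * t) for t in tokens]


def linear_attn_state(tokens, state=0.0):
    """Linear attention: every token is folded into one running state (sum of k * v)."""
    for t in tokens:
        k, v = float(t), 2.0 * t
        state += k * v
    return state


def linear_attn_with_snapshots(turns):
    """Keep a copy of the state after each turn, keyed by the token prefix so far."""
    snapshots = {}
    state, prefix = 0.0, ()
    for turn in turns:
        state = linear_attn_state(turn, state)
        prefix += tuple(turn)
        snapshots[prefix] = state  # this copy is what could be written to SSD
    return snapshots


turns = [[1, 2], [3]]

# Ordinary attention: the prefix cache falls out of the full cache by slicing.
full_cache = per_token_kv_cache([1, 2, 3])
prefix_ok = full_cache[:2] == per_token_kv_cache([1, 2])

# Linear attention: the final state alone does not contain the state after turn 1,
# but a snapshot taken at the turn boundary lets us resume from that prefix.
snaps = linear_attn_with_snapshots(turns)
resumed = linear_attn_state([4], snaps[(1, 2, 3)])
```

The cost is one state copy per turn instead of one entry per token, so cache hits would only be possible at turn boundaries that were actually saved, not at arbitrary prefixes.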

Could it be that Kilo Code's prompt is too long? I've heard that Cline's prompt alone is 11k.

I also encountered this issue. Have you resolved it? Could you share the solution?