Cheng Wan

Results 12 comments of Cheng Wan

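The first cache, `intermediate_cache1`, is no longer needed by the time we compute `intermediate_cache3`, so it can be released at that point. How about reusing the memory of `intermediate_cache1` for `intermediate_cache3`? We can achieve this by changing how the two caches are initialized.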
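A minimal sketch of the idea, assuming the two caches have shapes `(m, topk, n)` and `(m, topk, k)` and that `intermediate_cache1` is never read after `intermediate_cache3` starts being written. The helper name `alloc_shared_caches` and the parameters `m`, `topk`, `n`, `k` are hypothetical, and the sketch uses standard-library `memoryview` objects as a stand-in for the real tensors, so it only illustrates the initialization pattern rather than the actual kernel code:

```python
import struct

FLOAT_SIZE = struct.calcsize("f")  # bytes per element (float32 stand-in)


def alloc_shared_caches(m, topk, n, k):
    """Back two caches with one buffer sized for the larger of them.

    Hypothetical helper: cache1 has shape (m, topk, n), cache3 has shape
    (m, topk, k). Both are views over the same bytes, so this is only safe
    if cache1 is dead before cache3 is written.
    """
    elems1 = m * topk * n
    elems3 = m * topk * k
    # One allocation instead of two: max(...) rather than elems1 + elems3.
    backing = bytearray(max(elems1, elems3) * FLOAT_SIZE)
    raw = memoryview(backing)
    # Each cache is a reshaped view over the leading bytes of the buffer.
    cache1 = raw[: elems1 * FLOAT_SIZE].cast("f", shape=[m, topk, n])
    cache3 = raw[: elems3 * FLOAT_SIZE].cast("f", shape=[m, topk, k])
    return backing, cache1, cache3


backing, cache1, cache3 = alloc_shared_caches(m=2, topk=2, n=8, k=4)
separate_bytes = (cache1.nbytes + cache3.nbytes)
print("shared:", len(backing), "bytes; separate:", separate_bytes, "bytes")
```

With tensors, the same pattern would be one flat allocation followed by two reshaped slices of it. The saving is the size of the smaller cache, at the cost of making the lifetime assumption above a hard requirement.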

https://github.com/sgl-project/sglang/issues/4673 reports the same issue. Following https://github.com/sgl-project/sglang/issues/4673#issuecomment-2745578452 may fix your issue.