Cheng Wan
Results: 12 comments by Cheng Wan
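The first cache, `intermediate_cache1`, is no longer needed by the time we compute `intermediate_cache3`, so it can be released at that point. How about reusing the `intermediate_cache1` buffer for `intermediate_cache3`? We can achieve this by changing how the two caches are initialized.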
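A minimal sketch of that initialization change, using NumPy arrays as a stand-in for the real tensors. The function name `alloc_shared_caches`, the shapes `(num_tokens, top_k, n)` and `(num_tokens, top_k, k)`, and the sizes in the usage line are assumptions for illustration, not the project's actual code. The idea is to allocate one flat buffer sized for the larger cache and carve both caches out of it as views:

```python
import numpy as np


def alloc_shared_caches(num_tokens, top_k, n, k, dtype=np.float32):
    """Back cache1 and cache3 with a single buffer (hypothetical helper).

    Assumed shapes: cache1 is (num_tokens, top_k, n) and
    cache3 is (num_tokens, top_k, k).
    """
    rows = num_tokens * top_k
    # One allocation, large enough for whichever cache is bigger.
    shared = np.empty(rows * max(n, k), dtype=dtype)
    # Both caches are views over the start of the same buffer; no copy is made.
    cache1 = shared[: rows * n].reshape(num_tokens, top_k, n)
    cache3 = shared[: rows * k].reshape(num_tokens, top_k, k)
    return shared, cache1, cache3


shared, cache1, cache3 = alloc_shared_caches(num_tokens=4, top_k=2, n=16, k=8)
```

This is only safe if nothing reads `intermediate_cache1` after the first write to `intermediate_cache3`, which is the premise of the suggestion above. The saving is the size of the smaller of the two caches.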
https://github.com/sgl-project/sglang/issues/4673 reports the same issue. Following https://github.com/sgl-project/sglang/issues/4673#issuecomment-2745578452 may fix it.