mistral.rs
Any plans for KV cache compression algorithms like SnapKV and PyramidKV?
Hi, I'm wondering whether you have any plans to support KV cache compression methods like SnapKV and PyramidKV. These methods reduce the memory used by the KV cache, which would make inference more practical on low-memory machines. I'd be happy to contribute to this.