mistral.rs
Initial KV RingAttention code
This is the start of the RingAttention code. The changes so far create multiple KV caches (when num_devices > 1) and attempt to split the input into separate sequence chunks.
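For illustration, a minimal sketch of the per-device KV cache idea (the type and method names here are hypothetical, not the actual mistral.rs cache structs): each device participating in the ring holds its own KV cache for every layer.

```rust
use candle_core::{Device, Tensor};

/// Hypothetical per-layer KV entry: the keys/values accumulated so far.
#[derive(Default, Clone)]
struct LayerKvCache {
    kv: Option<(Tensor, Tensor)>,
}

/// One KV cache per device, each covering every layer of the model.
struct RingKvCaches {
    per_device: Vec<Vec<LayerKvCache>>,
}

impl RingKvCaches {
    fn new(devices: &[Device], num_layers: usize) -> Self {
        Self {
            per_device: devices
                .iter()
                .map(|_| vec![LayerKvCache::default(); num_layers])
                .collect(),
        }
    }

    /// Access the cache for one layer on one device.
    fn layer_mut(&mut self, device_idx: usize, layer_idx: usize) -> &mut LayerKvCache {
        &mut self.per_device[device_idx][layer_idx]
    }
}
```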
Code Metrics Report
```
===============================================================================
 Language             Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                 2           35           28            0            7
 Dockerfile               1           34           25            0            9
 Happy                    1          442          369            0           73
 JSON                    12          105          104            0            1
 Python                  46         2018         1718           62          238
 TOML                    20          596          536            2           58
 YAML                     2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks        4            0            0            0            0
 |- Markdown              2           77           32           31           14
 |- Python                2          196          169            1           26
 (Total)                             273          201           32           40
-------------------------------------------------------------------------------
 Markdown                30         2080            0         1580          500
 |- BASH                  5          101           98            0            3
 |- JSON                  1           12           12            0            0
 |- Python                5           92           82            0           10
 |- Rust                  7          441          395           22           24
 |- TOML                  2           75           63            0           12
 (Total)                            2801          650         1602          549
-------------------------------------------------------------------------------
 Rust                   202        62743        56960         1148         4635
 |- Markdown            103          950           13          885           52
 (Total)                           63693        56973         2033         4687
===============================================================================
 Total                  321        68074        59759         2794         5521
===============================================================================
```
I hadn't planned to use the chunk method; I plan to use IndexOp instead. I think I ran into some issues with the chunk method, but I don't remember what they were.
I kept the forward block mostly the same; I just moved the feedforward (MLP) layer so it runs after attention has been computed on all chunks. I'm not 100% sure it will work, but I'm trying to follow the algorithm as closely as possible.
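As a rough sketch of that ordering (the closures stand in for the real attention and MLP modules, and the residual adds are just the standard transformer-block pattern; this is not the actual llama.rs code):

```rust
use candle_core::{Result, Tensor};

/// Run one decoder block over pre-split sequence chunks: attention is computed
/// for every chunk first, and only then is the feed-forward (MLP) applied.
fn block_forward(
    chunks: &[Tensor],
    attn: impl Fn(&Tensor) -> Result<Tensor>,
    mlp: impl Fn(&Tensor) -> Result<Tensor>,
) -> Result<Vec<Tensor>> {
    // Phase 1: attention over all chunks. In a full RingAttention step this is
    // where the per-device KV blocks would be rotated around the ring.
    let mut attn_out = Vec::with_capacity(chunks.len());
    for x in chunks {
        attn_out.push(attn(x)?.add(x)?); // residual connection around attention
    }
    // Phase 2: feed-forward on each chunk's attention output.
    let mut out = Vec::with_capacity(attn_out.len());
    for x in &attn_out {
        out.push(mlp(x)?.add(x)?); // residual connection around the MLP
    }
    Ok(out)
}
```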
To use Tensor::chunk, maybe we can split the input ids along dim = D::Minus1 into num_devices chunks.
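For illustration, here is what that could look like with candle's Tensor::chunk (a minimal sketch; the helper name is made up and the integration into the model is not shown):

```rust
use candle_core::{D, Device, Result, Tensor};

/// Split [batch, seq_len] input ids into `num_devices` chunks along the last
/// (sequence) dimension. Chunk lengths may differ by one if seq_len is not
/// divisible by num_devices.
fn split_input_ids(input_ids: &Tensor, num_devices: usize) -> Result<Vec<Tensor>> {
    input_ids.chunk(num_devices, D::Minus1)
}

fn main() -> Result<()> {
    let input_ids = Tensor::arange(0u32, 16, &Device::Cpu)?.reshape((2, 8))?;
    for (i, c) in split_input_ids(&input_ids, 4)?.iter().enumerate() {
        println!("chunk {i}: {:?}", c.shape()); // each chunk is [2, 2]
    }
    Ok(())
}
```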
After that, we can just follow the algorithm for the rest. I would also recommend you split this into another model implementation, perhaps llama_ring_attention.rs and an associated NormalModelLoader, as this seems to be quite invasive. Essentially you need to split Block into a few steps.
> Essentially you need to split Block into a few steps.
Can you give me more details on the steps? That's where I'm confused.
> I would also recommend you split this into another model implementation, perhaps llama_ring_attention.rs and an associated NormalModelLoader, as this seems to be quite invasive.
I can do this.
I ran a simple test on the mapper function that you pointed out above and confirmed that it is the problem.
Here is the code snippet of the test:
The model now gives the correct output.
When you have the time, please let me know what should be done to create a more robust solution. I'm not sure how to proceed with it.
I don't want this to fall by the wayside, as I'll need it for my long-context use case.
- The latest commit has the sequence parallelism code using IndexOp. I wasn't sure how to use the dimension argument of the chunk method, so I figured IndexOp was the best route to go (a rough sketch of that approach is shown after this list).
- I also added the llama_ring_attention code, which is currently identical to the llama code. I figured it would be easier to see the sequence parallelism changes in the llama.rs file; I'll revert llama.rs to master in the next commit.
- Finally, I looked into the NormalLoader, but I wasn't sure what changes are needed to use llama_ring_attention. I'll need some help with that.
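For reference, a rough sketch of the IndexOp-based splitting mentioned in the first bullet (a hypothetical helper, not the code from the commit):

```rust
use candle_core::{D, IndexOp, Result, Tensor};

/// Split a [batch, seq_len] tensor into `num_devices` contiguous slices along
/// the sequence dimension using IndexOp instead of Tensor::chunk.
fn split_with_index_op(input_ids: &Tensor, num_devices: usize) -> Result<Vec<Tensor>> {
    let seq_len = input_ids.dim(D::Minus1)?;
    let chunk_len = (seq_len + num_devices - 1) / num_devices;
    let mut chunks = Vec::with_capacity(num_devices);
    for i in 0..num_devices {
        let start = i * chunk_len;
        let end = (start + chunk_len).min(seq_len);
        if start >= end {
            break;
        }
        // `.i((.., start..end))` slices only the last (sequence) dimension.
        chunks.push(input_ids.i((.., start..end))?);
    }
    Ok(chunks)
}
```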
I'm also still stuck on how to refactor the forward pass that you highlighted as an issue last time. I still need help with that :(