Meng Zhu

2 issues by Meng Zhu

## TL;DR
In V1, swap GPU KV cache blocks to CPU upon eviction and swap them back if there's a cache hit.

## Swap Strategy
CPU → GPU swap-in happens...
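The idea in the TL;DR can be illustrated with a toy two-tier cache. This is only a sketch under stated assumptions, not vLLM's implementation: the class `TwoTierBlockCache`, its method names, the LRU eviction policy, and the choice to swap in at lookup time are all hypothetical, and plain `bytes` payloads stand in for real KV tensors.

```python
from collections import OrderedDict


class TwoTierBlockCache:
    """Toy model of GPU/CPU KV block swapping (hypothetical, not vLLM's API)."""

    def __init__(self, gpu_capacity, cpu_capacity):
        if gpu_capacity < 1:
            raise ValueError("gpu_capacity must be at least 1")
        self.gpu_capacity = gpu_capacity
        self.cpu_capacity = cpu_capacity
        # Each tier maps block hash -> payload, ordered from least to
        # most recently used (assumed LRU policy).
        self.gpu = OrderedDict()
        self.cpu = OrderedDict()

    def _evict_from_gpu(self):
        # Swap-out: the least recently used GPU block moves to the CPU tier
        # instead of being discarded.
        block_hash, payload = self.gpu.popitem(last=False)
        if self.cpu_capacity < 1:
            return  # No CPU tier configured: the block is simply dropped.
        if len(self.cpu) >= self.cpu_capacity:
            self.cpu.popitem(last=False)  # CPU tier full: drop its oldest block.
        self.cpu[block_hash] = payload

    def put(self, block_hash, payload):
        if block_hash in self.gpu:
            self.gpu[block_hash] = payload
            self.gpu.move_to_end(block_hash)
            return
        self.cpu.pop(block_hash, None)  # Avoid keeping a stale CPU copy.
        if len(self.gpu) >= self.gpu_capacity:
            self._evict_from_gpu()
        self.gpu[block_hash] = payload

    def get(self, block_hash):
        if block_hash in self.gpu:
            self.gpu.move_to_end(block_hash)  # GPU hit: refresh recency.
            return self.gpu[block_hash]
        if block_hash in self.cpu:
            # CPU hit: swap the block back into the GPU tier.
            payload = self.cpu.pop(block_hash)
            self.put(block_hash, payload)
            return payload
        return None  # Miss in both tiers: the block would be recomputed.


cache = TwoTierBlockCache(gpu_capacity=2, cpu_capacity=2)
for name in ("a", "b", "c"):
    cache.put(name, name.encode())
swapped_in = cache.get("a")  # "a" was evicted to CPU, so this swaps it back in.
```

In this sketch, a block that misses in both tiers returns `None`, which stands for the case where the KV values have to be recomputed.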

v1

### Motivation.
Offloading device KV cache to the CPU can be worthwhile if the transfer overhead is lower than the cost of re-computation, saving precious GPU cycles. This is especially useful in cases such...

RFC