SnapKV
Add a SnapKV implementation for the transformers SDPA attention path, with a flash_attn availability check.
The SDPA path is used when flash_attn_2 is not available.
Currently only hijiack_llama is added; implementations for other models will be added later.
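
The availability check described above could look roughly like the sketch below. This is not the code in this PR: the helper names `flash_attn_available` and `pick_attn_implementation` are hypothetical, and the check assumes that being able to locate the `flash_attn` package is enough to treat FlashAttention-2 as usable (a real check may also need to verify the GPU and package version).

```python
import importlib.util


def flash_attn_available(module_name: str = "flash_attn") -> bool:
    """Return True if the named top-level package can be located for import."""
    return importlib.util.find_spec(module_name) is not None


def pick_attn_implementation(module_name: str = "flash_attn") -> str:
    """Hypothetical helper: prefer FlashAttention-2, otherwise fall back to SDPA.

    The returned strings match the values accepted by the transformers
    `attn_implementation` argument.
    """
    if flash_attn_available(module_name):
        return "flash_attention_2"
    return "sdpa"


if __name__ == "__main__":
    # Prints whichever backend this environment supports.
    print(pick_attn_implementation())
```

The returned string can then be passed as `attn_implementation` when loading a model with `from_pretrained`, so the SnapKV patch for the matching attention path is the one that gets exercised.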