H2O
Eviction doesn't happen at all if recent_budget=0
From https://github.com/FMInference/H2O/blob/281ffef3f1432ceb1a6899362d2f20e1ef13aa94/h2o_hf/utils_hh/modify_llama.py#L140-L156
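If recent_budget=0:
The mask is initialized to all ones: attn_mask = torch.ones(...).
The heavy-hitter step then scatters ones into it: attn_mask = attn_mask.scatter(-1, keep_topk, 1). This changes nothing, because the mask is already all ones.
The only line that zeroes entries, attn_mask[:, :-self.recent_budget] = 0, takes effect only when recent_budget != 0.
So with recent_budget=0 every token stays in the mask and nothing is evicted.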
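
The behavior can be reproduced with a small sketch. This is a simplified stand-in for the logic described above, not the repository code: the function names, the tensor sizes, and the single (heads, tokens) score tensor are assumptions made for illustration. The second function is one hypothetical way to avoid the problem (starting from zeros), not the authors' fix.

```python
import torch


def build_mask(scores: torch.Tensor, heavy_budget: int, recent_budget: int) -> torch.Tensor:
    """Simplified stand-in for the mask logic discussed above (hypothetical helper)."""
    attn_mask = torch.ones_like(scores)  # every token starts as "kept"
    if recent_budget != 0:
        # Evict everything except the most recent tokens.
        attn_mask[:, :-recent_budget] = 0
        selected = scores[:, :-recent_budget]
    else:
        # Nothing is zeroed on this path, so the mask stays all ones.
        selected = scores
    if heavy_budget != 0:
        _, keep_topk = selected.topk(k=heavy_budget, dim=-1, largest=True)
        # Writes 1s at the heavy-hitter positions; a no-op on an all-ones mask.
        attn_mask = attn_mask.scatter(-1, keep_topk, 1)
    return attn_mask


def build_mask_zero_init(scores: torch.Tensor, heavy_budget: int, recent_budget: int) -> torch.Tensor:
    """Hypothetical variant: start from zeros so only selected tokens are kept."""
    attn_mask = torch.zeros_like(scores)
    if recent_budget != 0:
        attn_mask[:, -recent_budget:] = 1  # keep the recent window
        selected = scores[:, :-recent_budget]
    else:
        selected = scores
    if heavy_budget != 0:
        _, keep_topk = selected.topk(k=heavy_budget, dim=-1, largest=True)
        attn_mask = attn_mask.scatter(-1, keep_topk, 1)
    return attn_mask


torch.manual_seed(0)
scores = torch.rand(2, 8)  # 2 heads, 8 cached tokens (made-up sizes)

# Number of tokens kept per head in each configuration.
kept_with_recent = build_mask(scores, heavy_budget=2, recent_budget=2).sum(dim=-1)
kept_without_recent = build_mask(scores, heavy_budget=2, recent_budget=0).sum(dim=-1)
kept_zero_init = build_mask_zero_init(scores, heavy_budget=2, recent_budget=0).sum(dim=-1)
print(kept_with_recent, kept_without_recent, kept_zero_init)
```

With these sizes, heavy_budget=2 and recent_budget=2 keep 4 of 8 tokens per head, while heavy_budget=2 and recent_budget=0 keep all 8. The zero-initialized variant keeps only the 2 heavy hitters.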