The difference in batch size leads to different results in LongBench testing
When I ran the LongBench evaluation with batch_size=1, I got the same results as in Table 4 of the paper. However, when I increased the batch size, the results were worse than with batch_size=1. For example, with model=Llama2-7B, dataset=TREC, group_size=32, residual_length=128, k_bits=2, and v_bits=2, the score was 66.0 at batch_size=1 but dropped to 57.0 at batch_size=2.
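
For reference, here is roughly how I set up the quantized model. This is a minimal sketch: the module path models.llama_kivi and the class name LlamaForCausalLM_KIVI reflect my reading of the repo and may not match it exactly.

```python
# Minimal setup sketch. The import path and class name below are my
# assumption about the repo layout; the config attributes mirror the
# hyperparameters from the experiment above.
import torch
from transformers import AutoTokenizer, LlamaConfig

from models.llama_kivi import LlamaForCausalLM_KIVI  # assumed module path

model_name = "meta-llama/Llama-2-7b-hf"

config = LlamaConfig.from_pretrained(model_name)
config.k_bits = 2             # key cache quantized to 2 bits
config.v_bits = 2             # value cache quantized to 2 bits
config.group_size = 32        # quantization group size
config.residual_length = 128  # recent tokens kept in full precision

model = LlamaForCausalLM_KIVI.from_pretrained(
    model_name,
    config=config,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```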
Moreover, the 16-bit (unquantized) results remain consistent across different batch sizes, which indicates that the underlying transformers Llama implementation handles batched inference correctly.
Is there a shared state or cache within a batch that could be causing this issue?
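
To help narrow this down, here is a minimal check I plan to run, reusing the model and tokenizer from the sketch above. With greedy decoding, batching should not change a sequence's output unless something in the batched path (shared state, or padded positions leaking into the quantized cache) interferes.

```python
# Compare per-sequence generations at batch_size=1 vs. batch_size=2.
# The prompts are placeholders; any two inputs of different lengths work.
import torch

prompts = [
    "Question: What is the capital of France?\nAnswer:",
    "Question: Name three primary colors.\nAnswer:",
]

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # left-pad so generation starts aligned

def generate(batch):
    inputs = tokenizer(batch, return_tensors="pt", padding=True).to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # Strip the (padded) prompt tokens before decoding.
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return tokenizer.batch_decode(new_tokens, skip_special_tokens=True)

solo = [generate([p])[0] for p in prompts]  # batch_size=1, one prompt at a time
batched = generate(prompts)                 # batch_size=2, both prompts together

for s, b in zip(solo, batched):
    print("MATCH" if s == b else "DIFF")
```

If the individual and batched continuations diverge under greedy decoding, that would point to the batched path itself rather than the evaluation harness.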
Hope to hear from you soon.