The difference in batch size leads to different results in LongBench testing
When I ran the LongBench evaluation with batch_size=1, I got the same results as in Table 4 of the paper. However, when I increased the batch size, the results were worse than with batch_size=1. For example, with model=Llama2-7B, dataset=TREC, group_size=32, residual_length=128, k_bits=2, and v_bits=2, the score was 66.0 at batch_size=1 but dropped to 57.0 at batch_size=2.
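
For reference, here is roughly how I set up the quantized model. This is a minimal sketch: the module path models.llama_kivi and the class name LlamaForCausalLM_KIVI reflect my reading of the repo and may not match it exactly.

```python
# Minimal setup sketch. The import path and class name below are my
# assumption about the repo layout; the config attributes mirror the
# hyperparameters from the experiment above.
import torch
from transformers import AutoTokenizer, LlamaConfig

from models.llama_kivi import LlamaForCausalLM_KIVI  # assumed module path

model_name = "meta-llama/Llama-2-7b-hf"

config = LlamaConfig.from_pretrained(model_name)
config.k_bits = 2             # key cache quantized to 2 bits
config.v_bits = 2             # value cache quantized to 2 bits
config.group_size = 32        # quantization group size
config.residual_length = 128  # recent tokens kept in full precision

model = LlamaForCausalLM_KIVI.from_pretrained(
    model_name,
    config=config,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```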
Moreover, the 16-bit (unquantized) results remain consistent across different batch sizes, which indicates that the underlying transformers Llama implementation handles batched inference correctly.
Is there a shared state or cache within a batch that could be causing this issue?
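
To help narrow this down, here is a minimal check I plan to run, reusing the model and tokenizer from the sketch above. With greedy decoding, batching should not change a sequence's output unless something in the batched path (shared state, or padded positions leaking into the quantized cache) interferes.

```python
# Compare per-sequence generations at batch_size=1 vs. batch_size=2.
# The prompts are placeholders; any two inputs of different lengths work.
import torch

prompts = [
    "Question: What is the capital of France?\nAnswer:",
    "Question: Name three primary colors.\nAnswer:",
]

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # left-pad so generation starts aligned

def generate(batch):
    inputs = tokenizer(batch, return_tensors="pt", padding=True).to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # Strip the (padded) prompt tokens before decoding.
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return tokenizer.batch_decode(new_tokens, skip_special_tokens=True)

solo = [generate([p])[0] for p in prompts]  # batch_size=1, one prompt at a time
batched = generate(prompts)                 # batch_size=2, both prompts together

for s, b in zip(solo, batched):
    print("MATCH" if s == b else "DIFF")
```

If the individual and batched continuations diverge under greedy decoding, that would point to the batched path itself rather than the evaluation harness.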
Hope to hear from you soon.