KIVI Unable to Reproduce Results for LongBench

Open ilil96 opened this issue 6 months ago • 2 comments

Hello,

I ran the provided LongBench code with the Llama-3-8B-Instruct model but couldn't reproduce the results reported in Table 8 of your paper. Specifically, the full-precision baseline scores 32.11 on Qasper in my run, while the reported score is 44.24.

I used the following command to run the model:

    python pred_long_bench.py --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --k_bits 16 --v_bits 16

Is there anything I might be missing?

Aug 26 '24 10:08 ilil96
