jxjessieli
Results: 2 comments of jxjessieli
I ran the same set of experiments and got the same results. I think this is because, with threshold=0.2, the retrieval frequency is 100%. The same can...
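One way to check the claim about threshold=0.2 is to count how often retrieval would fire at a given threshold. The sketch below assumes each example gets a scalar retrieval score that is compared against the threshold. The function name `retrieval_frequency` and the score values are hypothetical, made up for illustration, and are not taken from the repository or from any run.

```python
from typing import Sequence


def retrieval_frequency(scores: Sequence[float], threshold: float) -> float:
    """Return the fraction of examples whose score exceeds the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    # Assumption: retrieval is triggered when the score is strictly above the threshold.
    retrieved = sum(1 for score in scores if score > threshold)
    return retrieved / len(scores)


if __name__ == "__main__":
    # Hypothetical per-example retrieval scores, not measured values.
    example_scores = [0.35, 0.62, 0.48, 0.91, 0.27]
    for threshold in (0.2, 0.5, 0.9):
        freq = retrieval_frequency(example_scores, threshold)
        print(f"threshold={threshold}: retrieval frequency={freq:.0%}")
```

If every score sits above 0.2, as in this made-up list, the frequency is 100% and the run behaves the same as always retrieving, which would explain identical results across the two settings.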
I think you should specify the "task" argument accordingly:
`python run_baseline_lm.py --model_name meta-llama/Llama-2-7b-hf --input_file ./eval_data/arc_challenge_processed.jsonl --max_new_tokens 20 --metric match --result_fp ./eval_results/llama2_7b_arcc_results.json --task arc_c`
overall results: 0.28498293515358364
This result is higher...