How can the result be transformed into the format required for SWE-bench evaluation?
Wonderful work! I notice that SWE-bench evaluation requires the following files:
- eval.sh: the evaluation script
- patch.diff: the model's generated prediction
- report.json: summary of evaluation outcomes for this instance
- run_instance.log: a log of SWE-bench evaluation steps
- test_output.txt: the output of running eval.sh on patch.diff

In AutoCodeRover we only get the JSON and patch.diff. How can we get test_output.txt?
Thanks a lot!
Hi! You would need to first transform the JSON into JSONL (with a simple Python script, for example), then evaluate the JSONL with SWE-bench's containerized evaluation. Afterwards, you will find these files in SWE-bench/logs/.
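The JSON-to-JSONL step can be sketched as below. This assumes the predictions file is a JSON array of prediction objects (the file and field names here are illustrative; check your actual output):

```python
import json

def json_to_jsonl(src: str, dst: str) -> None:
    """Convert a JSON file containing a list of prediction dicts
    into JSONL format: one JSON object per line."""
    with open(src) as f:
        predictions = json.load(f)
    with open(dst, "w") as f:
        for pred in predictions:
            f.write(json.dumps(pred) + "\n")

# Example usage (adjust paths to your run's output):
# json_to_jsonl("predictions_for_swebench.json", "predictions_for_swebench.jsonl")
```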
Hi @crhf, when I run AutoCodeRover on SWE-bench Lite (using the Docker image), I receive a file predictions_for_swebench.json.
You mean: take this file, transform it to JSONL, then evaluate it with SWE-bench's containerized evaluation? For example:
```shell
python -m swebench.harness.run_evaluation \
    --dataset_name princeton-nlp/SWE-bench_Lite \
    --predictions_path predictions_for_swebench.jsonl \
    --max_workers 1 \
    --run_id evaluation
```
So the --predictions_path field will be predictions_for_swebench.jsonl. Is that correct?