OpenHands
How to reproduce OpenHands' performance on SWE-Bench-Verified.
Hi, I am trying to reproduce OpenHands' score on SWE-Bench-Verified. Could you please provide some instructions for reproducing it? Many thanks.
If you're trying to run the tests yourself, take a look at the instructions in the README here. You'll have to manually replace all references to princeton-nlp/SWE-bench_Lite with princeton-nlp/SWE-bench_Verified.
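In case it helps, the manual replacement could be scripted roughly like this. It is only a sketch: the function name `replace_dataset_refs`, the list of file suffixes, and the directory in the usage comment are assumptions for illustration, not something taken from the OpenHands repo, so check which files actually mention the dataset before running it.

```python
from pathlib import Path

OLD = "princeton-nlp/SWE-bench_Lite"
NEW = "princeton-nlp/SWE-bench_Verified"


def replace_dataset_refs(root, suffixes=(".py", ".sh", ".md", ".toml")):
    """Rewrite OLD to NEW in text files under root and return the changed paths.

    The suffix list is an assumption; widen or narrow it to match the checkout.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Only touch regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Skip anything that is not valid UTF-8 text.
            continue
        if OLD in text:
            path.write_text(text.replace(OLD, NEW), encoding="utf-8")
            changed.append(path)
    return changed


# Hypothetical usage; the directory name is an assumption about the checkout:
# for p in replace_dataset_refs("evaluation/swe_bench"):
#     print("updated", p)
```

The function edits files in place, so running it on a clean git checkout and reviewing the result with `git diff` makes it easy to undo anything unexpected.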
Thank you so much, @csmith49. I will try it.
Could you please provide the hyperparameters, such as the config.toml, for reproducing the score of openhand-codeact-2.1 (claude-sonnet) on the SWE-bench leaderboard? @csmith49
I used the default settings with Claude, and the only output I get is:
ERROR:root:<class 'RuntimeError'>: Maximum error retries reached for instance astropy__astropy-12907
Instances processed: 0%| | 0/300 [00:25<?, ?it/s]
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been stalled for over 30 days with no activity.