
feat: add commands for swebench

Open Sparkier opened this issue 1 year ago • 2 comments

Sparkier avatar Apr 03 '24 23:04 Sparkier

Should this code be in the ./evaluation dir? Why is it in codeact?

rbren avatar Apr 04 '24 03:04 rbren

Yeah, I really wasn't sure where to put things, or whether these commands should be just for the eval or something the agent should generally be augmented with.

Sparkier avatar Apr 04 '24 03:04 Sparkier

This is looking good to me! Maybe we should ask the eval folks?

rbren avatar Apr 05 '24 03:04 rbren

I actually don't think evaluation depends on the agent / OpenDevin design. It only needs one JSONL file that contains one patch output for each evaluation instance, so I think our current implementation should be fine?
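
For illustration, here is a minimal sketch of what producing such a JSONL file could look like. The key names `instance_id` and `model_patch`, the helper names, and the file path are assumptions made for this example, not something confirmed in this thread; the evaluation harness may expect different or additional fields.

```python
import json
from pathlib import Path


def write_predictions(predictions, path):
    """Write one JSON object per line, one line per evaluation instance.

    `predictions` maps an instance identifier to the patch text produced
    for that instance. The key names below are assumptions for this sketch.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for instance_id, patch in predictions.items():
            record = {
                "instance_id": instance_id,  # assumed field name
                "model_patch": patch,        # assumed field name
            }
            # json.dumps escapes newlines inside the patch, so each
            # record stays on a single line of the file.
            f.write(json.dumps(record) + "\n")
    return path


def read_predictions(path):
    """Read the JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Called as `write_predictions({"some-instance": patch_text}, "predictions.jsonl")`, this writes a file that depends only on the patches themselves, which is why the agent's internal design would not matter to the evaluation.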

xingyaoww avatar Apr 05 '24 04:04 xingyaoww

Are these functions meant for swe BENCH, as the title indicates? or for swe AGENT as described in this issue: https://github.com/OpenDevin/OpenDevin/issues/570

foragerr avatar Apr 05 '24 12:04 foragerr

Are these functions meant for swe BENCH, as the title indicates? or for swe AGENT as described in this issue: #570

Well, these functions are akin to what the SWE Agent folks did, but they are meant to improve performance on SWE Bench, so pick whichever you like. It also seems like we're thinking of applying this more broadly, beyond both.

Sparkier avatar Apr 05 '24 16:04 Sparkier