Bowen Li
Hey @mlejva, @xingyaoww and I are currently working on integrating the SWE-bench environment into the agent pipeline. Concurrently, we're examining various design options for this integration. Basically, we've prepared a...
Hi @mlejva, that sounds great! I'm not familiar with E2B. Could you please share some documentation or code where I can learn more about it? We are still in the...
@mlejva Sure, no problem!
@rezzie-rich Thank you for the question! As @frankxu2004 clarified, we report only the pass@1 results in the graph. Our evaluation containerization currently supports only SWE-bench-lite, and we will extend...
Hey guys, we have a private, stabilized version of the SWE-bench evaluation pipeline, but it now lags behind the official SWE-bench repo. We will push the changes to the forked SWE-bench...