OpenHands
feat: SWE Agent Implementation
WIP: Implementing an agent that works similarly to SWE-Agent
Agent Features:
- [x] Agent-Computer Interface (ACI) - simplifies the commands the LLM has to issue.
- [x] Implement: 'read', 'write', 'browse', 'exit', 'edit', 'goto', 'scroll'
- [x] Think-Act prompting structure - guides the model to think about what its next step should be and then write code to carry it out
- [x] Short Term Memory - allows the model to see its last n steps before taking the next one
- [x] Tells the model about its current working dir, file, and line
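
For discussion, here is a minimal sketch of how the features above could fit together. It is not the code in this PR: the `Thought:` / `Action:` reply layout, the names `parse_response`, `AgentState`, `record`, and `prompt_context`, and the default memory size are all assumptions made for illustration. Only the command names come from the list above.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple

# Command names come from the feature list; everything else here is hypothetical.
COMMANDS = {"read", "write", "browse", "exit", "edit", "goto", "scroll"}

Step = Tuple[str, str, List[str]]  # (thought, command, args)


def parse_response(text: str) -> Step:
    """Split a model reply into (thought, command, args).

    Assumes a 'Thought: ... Action: <command> <args>' layout (an assumption).
    """
    thought, sep, action = text.partition("Action:")
    if not sep:
        raise ValueError("reply has no 'Action:' section")
    thought = thought.replace("Thought:", "", 1).strip()
    parts = action.split()
    if not parts or parts[0] not in COMMANDS:
        raise ValueError(f"unknown command: {action.strip()!r}")
    return thought, parts[0], parts[1:]


@dataclass
class AgentState:
    """Working context plus a bounded history of the last n steps."""

    cwd: str = "."
    file: Optional[str] = None
    line: int = 0
    memory_size: int = 3  # the "n" in "last n steps"; value is arbitrary
    steps: Deque[Step] = field(init=False)

    def __post_init__(self) -> None:
        # A deque with maxlen silently drops the oldest step when full.
        self.steps = deque(maxlen=self.memory_size)

    def record(self, step: Step) -> None:
        """Store a step and update the current line on a numeric 'goto'."""
        self.steps.append(step)
        _, command, args = step
        if command == "goto" and args and args[0].isdigit():
            self.line = int(args[0])

    def prompt_context(self) -> str:
        """Render the status line and recent steps for the next prompt."""
        history = "\n".join(
            f"- {command} {' '.join(args)}".rstrip()
            for _, command, args in self.steps
        )
        status = f"cwd={self.cwd} file={self.file} line={self.line}"
        return f"{status}\nRecent steps:\n{history}"
```

Each turn would call `parse_response` on the model output, pass the result to `record`, and prepend `prompt_context()` to the next prompt. A bounded `deque` keeps the prompt size constant no matter how long the session runs.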
Help / Suggestions:
- I'm open to ideas on how to improve this
- This agent is meant to stay close to SWE-Agent, so please keep suggestions within that scope
- Any help or additional code would be much appreciated
Links/context:
- SWE-Agent Demo
- Repo
- Issue #570
| Agent/model | SWE-bench % resolved |
|---|---|
| SWE-Agent (GPT-4) | 12.29% |
| Devin (25% of eval set) | 13.84% |
| Claude 3 Opus (RAG) | 3.79% |
| GPT-4 (RAG) | 1.44% |
Since SWE-Agent comes close to Devin on SWE-bench (12.29% vs. 13.84% resolved, with Devin evaluated on only 25% of the eval set), I think emulating it first and then iterating on it is the best route.