Ryan H. Tran

Results: 28 issues by Ryan H. Tran

This PR provides a draft evaluation integration for the MINT benchmark, which tests the agent's ability to solve tasks with multi-turn interactions. This benchmark tests the agent's ability of code...

evaluation

**What problem or use case are you trying to solve?** The current search skills available to the agent are: ```python - search_dir(search_term, dir_path='./'): # Searches for a term in all...

enhancement
Stale

**Short description of the problem this fixes or functionality that this introduces. This may be used for the CHANGELOG** - This PR implements a simplified multi-agent workflow inspired by the...

enhancement
agent framework

### Is there an existing issue for the same bug? - [X] I have checked the troubleshooting document at https://docs.all-hands.dev/modules/usage/troubleshooting - [X] I have checked the existing issues. ### Describe...

bug

### Is there an existing issue for the same bug? - [X] I have checked the troubleshooting document at https://docs.all-hands.dev/modules/usage/troubleshooting - [X] I have checked the existing issues. ### Describe...

bug
evaluation
severity:medium

**End-user friendly description of the problem this fixes or functionality that this introduces** - [ ] Include this change in the Release Notes. If checked, you must provide an **end-user...

**End-user friendly description of the problem this fixes or functionality that this introduces** - [ ] Include this change in the Release Notes. If checked, you must provide an **end-user...

run-eval-m