agential
agential copied to clipboard
[Feature Request]: Reflexion
Feature Description
Implement:
- [x] HotpotQA
- [x] #123
- [x] #122
- [x] #124
- [x] #125
- [x] #126
- [x] #127
- [x] #128
- [x] #129
- [ ] #131
- [ ] #132
- [ ] #133 (includes ALFWorld & WebShop)
The decision-making benchmarks (ALFWorld, WebShop, and AgentBench) will require more design work. Swapping out the prompts won't suffice.
Run:
- [ ] HotpotQA
- [ ] TriviaQA
- [ ] AmbigNQ
- [ ] GSM8k
- [ ] SVAMP
- [ ] TabMWP
- [ ] MBPP
- [ ] HumanEval
- [ ] ALFWorld
- [ ] WebShop
- [ ] AgentBench (includes ALFWorld & WebShop)