promptulate
Benchmark for Agent
🚀 Feature Request
We need a benchmark to evaluate the Agent's abilities.
References
- ~~https://github.com/THUDM/AgentBench~~ AgentBench evaluates different LLM models.
- https://toolemu.com/
- https://mp.weixin.qq.com/s/0FZrgFosHzzYFBRiV3ba2g