
ALFWorld success rate is only 0.09

Open XGJ111 opened this issue 9 months ago • 4 comments

I used the provided AgentEvol-7B model, set max_round to "30", and ran the evaluation with base_eval_script.sh. The results are as follows:

Image

XGJ111 avatar Mar 18 '25 01:03 XGJ111

Same situation here. Could you help take a look at this issue? @WooooDyy

carrot-codeing avatar Apr 14 '25 11:04 carrot-codeing

same case @WooooDyy

realtmxi avatar May 15 '25 12:05 realtmxi

Same case @WooooDyy, the ALFWorld score does not seem to align with the paper.

MHlk avatar Jun 11 '25 05:06 MHlk

Hi @XGJ111 @carrot-codeing @realtmxi @MHlk,

Thanks for your comments and feedback. We checked our evaluation process as well as our ALFWorld implementation, and found that the cause lies in the updated version of the original ALFWorld library.

We suggest using alfworld==0.3.3 in AgentGym. Note that you may need to remove the game files downloaded by the newer ALFWorld library (e.g. rm -rf ~/.cache/alfworld by default) and re-run alfworld-download.
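Before re-running the evaluation, the setup described above could be sanity-checked with a small script. This is only a minimal sketch: the helper names `installed_version` and `check_setup` are hypothetical, and the cache path is assumed to be the default one from the rm -rf example. It only checks that the pinned version is installed and that a cache directory exists; it cannot tell whether the game files in that directory came from a newer ALFWorld release, so removing them and re-running alfworld-download is still needed after a downgrade.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path

EXPECTED_VERSION = "0.3.3"  # version suggested in this thread
# Assumption: default cache location, taken from the rm -rf example above.
DEFAULT_CACHE = Path.home() / ".cache" / "alfworld"


def installed_version(package="alfworld"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_setup(package="alfworld", expected=EXPECTED_VERSION, cache_dir=DEFAULT_CACHE):
    """Collect hints about remaining steps; an empty list means nothing was flagged."""
    hints = []
    found = installed_version(package)
    if found != expected:
        state = found or "not installed"
        hints.append(f"{package} is {state}; try: pip install {package}=={expected}")
    if not Path(cache_dir).is_dir():
        hints.append(f"{cache_dir} not found; run alfworld-download to fetch the game files")
    return hints


if __name__ == "__main__":
    messages = check_setup() or ["alfworld version and cache directory look as expected"]
    for message in messages:
        print(message)
```

Running the script in the same environment as base_eval_script.sh prints one line per flagged item, or a single confirmation line if nothing was flagged.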

Andy15 avatar Jun 12 '25 09:06 Andy15