AgentEvol-7B's context length is too short; training scripts do not match the paper
Hi, could you please clarify the following issues:
- The training scripts are incomplete. In the paper, you state that there are two training phases: behavioral cloning (BC) and AgentEvol, a method for investigating the potential of agent self-evolution. However, the provided code for the two phases appears to be identical, so the AgentEvol code does not seem to match what the paper describes (see the first sketch after this list for the loop I expected).
- How is the model AgentGym/AgentEvol-7B evaluated? This model has a context length of 2048, and when I tried to evaluate it on some environments, the runs failed because the context is too short (see the second snippet below for how I checked this). Could you provide the evaluation code needed to reproduce the reported results?
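
On the first point, here is a rough sketch of the exploration-learning loop I expected the AgentEvol phase to be, based on my reading of the paper, rather than a rerun of the BC script. Every helper name in it (`sample_trajectories`, `supervised_finetune`) is a hypothetical placeholder, not an actual AgentGym API:

```python
# Hypothetical sketch of the AgentEvol exploration-learning loop as I
# understood it from the paper; the helpers passed in are placeholders.

def agentevol(model, envs, bc_trajectories, sample_trajectories,
              supervised_finetune, num_iterations=4):
    """Alternate exploration and learning, starting from the BC-trained model."""
    dataset = list(bc_trajectories)  # keep the behavioral-cloning data
    for _ in range(num_iterations):
        # Exploration: roll out the current model in every environment and
        # keep only the trajectories that earn a positive reward.
        for env in envs:
            for traj in sample_trajectories(model, env):
                if traj.reward > 0:
                    dataset.append(traj)
        # Learning: the same SFT objective as BC, but on the merged set of
        # expert data plus self-generated successful trajectories.
        model = supervised_finetune(model, dataset)
    return model
```

If the released scripts are meant to implement something like this, could you point out where the exploration step happens?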
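On the second point, this is how I confirmed the 2048-token limit; it assumes the checkpoint is the one published on the Hugging Face Hub under the id `AgentGym/AgentEvol-7B`:

```python
from transformers import AutoConfig

# Load only the model config from the Hub; no weights are downloaded.
config = AutoConfig.from_pretrained("AgentGym/AgentEvol-7B")

# Llama-style configs expose the maximum context window as
# max_position_embeddings; for this checkpoint it reports 2048.
print(config.max_position_embeddings)
```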
Same question here: how can the reported scores be reproduced?
Same question; for me the code does not even run, it just hangs.