AgentGym

Inconsistent number of instructions for sciworld_test.json on HF dataset

Open xingjianleng opened this issue 7 months ago • 3 comments

Dear authors,

Thanks for your great work!

I'm trying to reproduce the evaluation results reported in the paper, but I noticed that the number of instructions in the paper does not match the released evaluation data.

Table 2 of the paper lists 200 evaluation instructions for the Sciworld environment, but sciworld_test.json in the AgentEval dataset on Hugging Face contains 1042 samples. In addition, the conversation content of each sample should be empty ([]), but the file currently includes the full trajectories.
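For reference, here is a minimal sketch of how the two counts can be checked on a local copy of the file. It assumes the file is a JSON list of objects and that each object stores its trajectory under a key named `conversations`; that key name and both helper function names are my assumptions for illustration, not something confirmed by the repository.

```python
import json
from pathlib import Path


def load_samples(path):
    """Load a JSON file that is assumed to contain a list of sample objects."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize_samples(samples, conversation_key="conversations"):
    """Count all samples and those that carry non-empty conversation content.

    `conversation_key` is an assumed field name; adjust it if the file differs.
    """
    total = len(samples)
    # A missing key, None, or an empty list all count as "no conversation content".
    non_empty = sum(1 for sample in samples if sample.get(conversation_key))
    return {"total": total, "non_empty_conversations": non_empty}
```

Calling `summarize_samples(load_samples("sciworld_test.json"))` on a downloaded copy should, if the file matched the paper, give a total of 200 with no non-empty conversations.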

Could you please update sciworld_test.json on HF datasets to the correct version, with 200 samples and no conversation content?

Thanks in advance.

xingjianleng, Jul 18 '24 04:07