
[Bug Report] Vector env return value

Open Root970103 opened this issue 1 year ago • 1 comments

Describe the bug Hi, I used the vector env API in gym to train on Atari-PongNoFrameSkip-v4. After the agent had interacted with the environment for a while, I noticed something strange: the cumulative reward was 21.0, but the corresponding done status was still False.

An intuitive example is described below:

rewards: [21.0, 19.0, 17.0, 21.0, 18.0, 21.0]
done: [True, False, False, True, False, False]

In this case, some envs that reached the max reward were not marked done, and their cumulative reward kept increasing. Is this behavior normal?
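For reference, here is a minimal sketch of how per-env cumulative rewards could be tracked alongside the done flags. It is not the code from my training loop; `update_returns` and the two-env rollout are hypothetical names and data made up for illustration, and it assumes the vector env auto-resets a sub-env on the step where its done flag is True, so the running total has to be cleared at that point.

```python
from typing import List, Sequence, Tuple


def update_returns(
    running: List[float],
    rewards: Sequence[float],
    dones: Sequence[bool],
) -> Tuple[List[float], List[float]]:
    """Add one step of rewards to the per-env running returns.

    Returns (new_running, finished), where `finished` holds the returns of
    episodes that ended on this step. The running total of a finished env
    is reset to 0.0 (assumption: the vector env auto-resets that sub-env).
    """
    new_running = []
    finished = []
    for total, reward, done in zip(running, rewards, dones):
        total += reward
        if done:
            finished.append(total)  # episode ended: record its return
            total = 0.0             # start counting the next episode
        new_running.append(total)
    return new_running, finished


# Hypothetical two-env rollout: env 0 finishes on the second step.
running = [0.0, 0.0]
running, finished = update_returns(running, [1.0, 0.0], [False, False])
running, finished = update_returns(running, [1.0, -1.0], [True, False])
print(running, finished)
```

If the accumulator is not cleared when done is True, the total carries over into the next episode, so a value like 21.0 can show up next to done=False. I am not sure whether that is what happens in my case.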

System Info gym==0.18

Root970103 • Dec 21 '23 10:12

Without more of your code, it is difficult to tell what is happening. Also, this is for v0.18, which is several years old, so we wouldn't be updating any code unless this is still an issue now.

pseudo-rnd-thoughts • Dec 25 '23 20:12