Super-mario-bros-PPO-pytorch about train.py

about train.py

Open xiaolonghao opened this issue 4 years ago • 1 comment

Hello, I was reading your code and have a question about line 114 of train.py, i.e. values = torch.cat(values).detach(). I think this statement should come after line 123. Otherwise, in line 120, i.e. gae = gae + reward + opt.gamma * next_value.detach() * (1 - done) - value.detach(), value becomes a constant (size=1) instead of a vector (size=8). Could you check this and answer my question? Thank you.
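To make the shape concern concrete, here is a minimal sketch in plain Python, with lists standing in for tensors. It is not the repository's code. It assumes that "size=8" is the number of parallel environments, that values holds one size-8 entry per time step before concatenation, and that next_value is set to the current value after each step. NUM_ENVS, NUM_STEPS, GAMMA, bootstrap_value, and accumulate_gae are hypothetical names, and only the update quoted above is applied.

```python
from itertools import chain

NUM_ENVS = 8    # assumed: the "size=8" in the question is the number of parallel environments
NUM_STEPS = 4   # hypothetical rollout length, kept small for illustration
GAMMA = 0.5     # stands in for opt.gamma; the value is arbitrary

# One entry per time step, each holding one number per environment.
values = [[float(t)] * NUM_ENVS for t in range(NUM_STEPS)]
rewards = [[1.0] * NUM_ENVS for _ in range(NUM_STEPS)]
dones = [[0.0] * NUM_ENVS for _ in range(NUM_STEPS)]
bootstrap_value = [0.0] * NUM_ENVS  # hypothetical value estimate after the last step

# Flattening before the loop (the list analogue of torch.cat on 1-D tensors)
# discards the per-step grouping, so zip pairs a scalar with a size-8 reward.
flat_values = list(chain.from_iterable(values))
flat_value, reward, done = next(zip(flat_values, rewards, dones))
flat_value_is_scalar = isinstance(flat_value, float)


def accumulate_gae(values, rewards, dones, next_value, gamma):
    """Apply only the update quoted in the question, walking backwards in time."""
    gae = [0.0] * len(next_value)
    advantages = []
    for value, reward, done in reversed(list(zip(values, rewards, dones))):
        # Element-wise form of: gae + reward + gamma * next_value * (1 - done) - value
        gae = [g + r + gamma * nv * (1 - d) - v
               for g, r, nv, d, v in zip(gae, reward, next_value, done, value)]
        next_value = value  # assumption: the next iteration bootstraps from this step
        advantages.append(gae)
    return advantages[::-1]


# Keeping the per-step grouping gives one size-8 advantage per time step.
advantages = accumulate_gae(values, rewards, dones, bootstrap_value, GAMMA)
print(flat_value_is_scalar, len(flat_values), len(advantages[0]))
```

In this sketch the flattened sequence yields one scalar per iteration, while the grouped list yields a size-8 entry per step. That matches the suggestion to concatenate only after the loop has finished.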

Sep 16 '20 12:09 xiaolonghao

I think it should be as you say.

Oct 23 '23 11:10 zhuzhu18
