Results: 1 issue by Githuber-zwb

I'm a bit confused about the PPO update process. At line 110: ![Screenshot from 2024-06-06 11-21-26](https://github.com/zcaicaros/L2D/assets/71386827/8a9ed211-bd73-4ef9-8178-c50ea4fed5b0) The rewards in a single episode are normalized by subtracting the mean while...
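
For readers who cannot see the screenshot, here is a minimal sketch of what per-episode reward normalization in a PPO update commonly looks like. It assumes the line in question standardizes the discounted returns of one episode by subtracting their mean and dividing by their standard deviation; that is an assumption on my part, not a copy of the repository's code. The function names, `gamma`, and `eps` are hypothetical and chosen only for illustration.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for each step of a single episode."""
    returns: List[float] = []
    running = 0.0
    # Walk the episode backwards so each step can reuse the next step's return.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values: List[float], eps: float = 1e-5) -> List[float]:
    """Standardize values within one episode: (x - mean) / (std + eps).

    Uses the sample standard deviation (n - 1 in the denominator); eps is an
    assumed small constant that guards against division by zero.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    std = var ** 0.5
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical three-step episode.
episode_rewards = [1.0, 0.0, 2.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
normalized = normalize(returns)
print(returns)
print(normalized)
```

Note that subtracting the mean shifts every step of the episode by the same amount, so the normalized values always sum to roughly zero within that episode, whatever the episode's overall quality was.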