Richard comments

Richard

Issue with critic target in PPO

> In the [line used to define the returns](https://github.com/philtabor/Youtube-Code-Repository/blob/1ef76059bf55f7df9ccc09fce0e0bfb7c13e89bd/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py#L186), we use the GAE + values as the target for the critic to learn. Is this correct?
>
> My intuition...
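
For readers following along, here is a minimal sketch of what "GAE + values" evaluates to. It is not the code from the linked repository: the function names `gae_advantages` and `critic_targets`, the toy numbers, and the assumption of a single trajectory segment with no terminal states inside it are all illustrative. Under those assumptions, adding the value estimates back to the GAE advantages gives the lambda-return, which reduces to the one-step TD target when `lam=0` and to the bootstrapped discounted return when `lam=1`.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """GAE for one trajectory segment (assumes no episode ends inside it)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value  # bootstrap value for the state after the segment
    running = 0.0            # discounted sum of TD errors seen so far
    for t in reversed(range(len(rewards))):
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def critic_targets(advantages, values):
    """Critic regression target: advantage estimate plus value estimate."""
    return [a + v for a, v in zip(advantages, values)]


# Hypothetical toy data, chosen only to make the arithmetic easy to check.
rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.4, 0.3]

# lam=1: targets match the discounted return 1 + 0.9 * (1 + 0.9 * 1) at t=0.
mc_targets = critic_targets(
    gae_advantages(rewards, values, last_value=0.0, gamma=0.9, lam=1.0), values
)

# lam=0: targets match the one-step TD target r_t + gamma * V(s_{t+1}).
td_targets = critic_targets(
    gae_advantages(rewards, values, last_value=0.0, gamma=0.9, lam=0.0), values
)
```

In this sketch the old value estimates cancel out of the target entirely when `lam=1` and enter only through the bootstrap term when `lam=0`; intermediate values of `lam` interpolate between the two.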