baselines

Recurrent Neural Net compatibility of PPO1

Open Bick95 opened this issue 4 years ago • 0 comments

Hey, I was wondering whether vanilla PPO with the clipped surrogate objective (PPO1, so to speak), as presented in the corresponding paper, is fully compatible with recurrent neural nets, or whether something prevents it from being used with them.
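To make clear which objective I mean, here is a minimal sketch of the clipped surrogate as I understand it from the paper. The function name `clipped_surrogate` and its parameters are my own and are not taken from this repo; it works on plain Python lists of per-timestep log-probabilities and advantages.

```python
import math

def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of timesteps.

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

As far as I can tell, this expression only needs per-timestep action probabilities and advantages, and does not refer to how the policy network computes them.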

Of the two PPO implementations provided in this repo, I saw that at least the first one, i.e. PPO1, is not compatible with RNNs. But I assume this is down to the concrete implementation rather than the theory behind PPO1?

Since the report states that the finite-horizon estimator of the advantage function in particular is "well-suited for the use with recurrent neural networks", I would assume that PPO1 is, at least in theory, compatible with RNNs.
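For concreteness, this is the estimator I am referring to, written as a small Python sketch of the truncated generalized advantage estimate over a T-step rollout. The name `truncated_gae` and the default values for `gamma` and `lam` are my own choices for illustration, not something taken from this repo.

```python
def truncated_gae(rewards, values, gamma=0.99, lam=0.95):
    """Finite-horizon advantage estimates for one rollout segment.

    rewards: list of length T.
    values:  list of length T + 1; the last entry is the bootstrap
             value V(s_T) of the state reached after the segment.
    """
    T = len(rewards)
    advantages = [0.0] * T
    running = 0.0
    # Walk backwards so each step reuses the discounted sum after it.
    for t in reversed(range(T)):
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

For example, `truncated_gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)` gives `[2.0, 1.0]`. The estimate only looks at a fixed-length segment of T steps, which is what I take "finite-horizon" to mean here.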

Thank you in advance!

Bick95, Dec 12 '20 16:12