Zhilong Zheng

Results 2 issues of Zhilong Zheng

Hi, I'm using RecurrentPPO to train a RecurrentActorCriticPolicy. I noticed that when collecting rollout data, the LSTM hidden states at each time step are also stored in the rollout...

question