Zhilong Zheng
Results: 2 issues of Zhilong Zheng
Hi, I'm using RecurrentPPO to train a RecurrentActorCriticPolicy. I noticed that when collecting rollout data, the LSTM hidden states at each time step are also stored in the rollout...
question