rlpyt
[LSTM PPO] how to increase batch_B without creating multiple environment objects?
Hello,
When training an LSTM PPO agent, is there a way to sample multiple batches of length batch_T between PPO updates (i.e., batch_B > 1) without creating multiple environment objects, as is done here: https://github.com/astooke/rlpyt/blob/f04f23db1eb7b5915d88401fca67869968a07a37/rlpyt/samplers/serial/sampler.py#L44

I'm asking because I'm using a Gazebo-simulated environment with ROS, which is not only extremely computationally demanding to instantiate multiple times, but also makes it difficult to isolate the internal ROS messages sent within each environment object.
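For context, here is a minimal sketch of the workaround I have in mind: filling the (batch_T, batch_B) sample buffer from a single environment by collecting the batch_B segments one after another instead of stepping batch_B environment instances in parallel. This is not rlpyt's actual sampler API; `ToyEnv`, `collect_serial`, and the `policy` callable are hypothetical stand-ins for illustration.

```python
import numpy as np

class ToyEnv:
    """Hypothetical stand-in for the real Gazebo/ROS environment."""
    def __init__(self, obs_dim=4):
        self.obs_dim = obs_dim
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(self.obs_dim, dtype=np.float32)

    def step(self, action):
        self.t += 1
        obs = np.full(self.obs_dim, self.t, dtype=np.float32)
        reward = float(action)
        done = self.t % 10 == 0  # toy episode boundary
        return obs, reward, done

def collect_serial(env, policy, batch_T, batch_B):
    """Collect batch_B segments of length batch_T from ONE env, sequentially.
    Returns arrays shaped (batch_T, batch_B, ...), the time-major layout
    rlpyt's sample buffers use."""
    obs_buf = np.zeros((batch_T, batch_B, env.obs_dim), dtype=np.float32)
    rew_buf = np.zeros((batch_T, batch_B), dtype=np.float32)
    done_buf = np.zeros((batch_T, batch_B), dtype=bool)
    obs = env.reset()
    for b in range(batch_B):        # segments, one after another
        for t in range(batch_T):    # time steps within a segment
            action = policy(obs)
            obs_buf[t, b] = obs
            obs, rew_buf[t, b], done_buf[t, b] = env.step(action)
            if done_buf[t, b]:
                obs = env.reset()   # the LSTM state would also be reset here
    return obs_buf, rew_buf, done_buf

obs, rew, done = collect_serial(ToyEnv(), policy=lambda o: 1,
                                batch_T=5, batch_B=3)
print(obs.shape, rew.shape)  # (5, 3, 4) (5, 3)
```

One caveat with this approach: the batch_B segments are no longer simultaneous, so they come from consecutive (correlated) stretches of the same environment; batch_B then only controls how many segments go into one PPO update, and the LSTM state has to be carried or reset correctly at each segment boundary.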
Thank you in advance :)