Understanding parallel environments
Hi! I wanted to ask some questions about the current implementation of parallel environments in this framework.
As far as I understand, using ParallelPyEnvironment lets you run several environments in parallel, each of them collecting a trajectory of states, actions, and rewards. What is not clear to me is how these trajectories are handled in the optimization part of the agents. Does the optimizer treat each trajectory individually? Also, what is the difference between ParallelPyEnvironment and BatchedPyEnvironment?
cc @jerabaul29 and @franalcantara, feel free to ask/add information here :)
Thanks!