rlkit
Multiple worker support
Is there support for multiple workers (threads each feeding in independent samples)?
I'm trying to reproduce the pick-and-place results from OpenAI's HER baselines, where they mention needing 19 workers to do so: https://github.com/openai/baselines/tree/master/baselines/her
On the other hand, it seems like multiple workers shouldn't be strictly necessary, since HER is off-policy.
No. However, if you're interested in implementing it, I would start with the batch training mode and just replace the "_take_step_in_env" call with some parallelized implementation.
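As a very rough sketch of what that parallelized replacement could look like (the names here, _worker and collect_steps_parallel, are made up for illustration and are not existing rlkit code), each worker process could own its own environment copy and send transitions back to the main process:

```python
import multiprocessing as mp


def _worker(env_fn, num_steps, queue):
    """Step one environment in its own process and push transitions to a queue."""
    env = env_fn()
    obs = env.reset()
    for _ in range(num_steps):
        # A real implementation would ship the latest policy parameters to the
        # worker; random actions are used here only to keep the sketch short.
        action = env.action_space.sample()
        next_obs, reward, done, info = env.step(action)
        queue.put((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
    queue.put(None)  # sentinel: this worker is done


def collect_steps_parallel(env_fn, num_workers, steps_per_worker):
    """Drop-in idea for the single-process stepping call: sample with N workers."""
    queue = mp.Queue()
    procs = [mp.Process(target=_worker, args=(env_fn, steps_per_worker, queue))
             for _ in range(num_workers)]
    for p in procs:
        p.start()

    transitions, finished = [], 0
    while finished < num_workers:
        item = queue.get()
        if item is None:
            finished += 1
        else:
            transitions.append(item)

    for p in procs:
        p.join()
    return transitions
```

The main process would then add the returned transitions to the shared replay buffer and run the usual training step; most of the remaining work is broadcasting updated policy parameters to the workers between collection rounds.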
I'm not planning on adding this any time soon, though I've been talking with some people at Berkeley who might be interested in doing this, either as part of this repo or separately. PRs would also be welcomed, and if you're interested in this let me know.
I'm going to re-open this and label it as an enhancement request. Though, like I said, I probably won't get around to it any time soon.
Is there any progress on this?
@bycn You can check how I implemented it here: https://github.com/richardrl/rlkit-relational/blob/master/rlkit/torch/optim/mpi_adam.py
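The core idea in that file is standard MPI gradient averaging: every process computes a gradient on its own local batch, the gradients are averaged across processes with an allreduce, and each process then applies the identical update. A simplified sketch of that pattern with mpi4py and PyTorch (not the actual code from the repo):

```python
import torch
from mpi4py import MPI

comm = MPI.COMM_WORLD


def average_gradients(params):
    """Average the .grad tensors of `params` across all MPI processes, in place."""
    for p in params:
        if p.grad is None:
            continue
        local_grad = p.grad.data.cpu().numpy()
        averaged = local_grad.copy()
        comm.Allreduce(local_grad, averaged, op=MPI.SUM)
        averaged /= comm.Get_size()
        p.grad.data.copy_(torch.from_numpy(averaged))


# Inside each process's training loop:
#   loss = compute_loss(local_batch)       # local batch, different per process
#   optimizer.zero_grad()
#   loss.backward()
#   average_gradients(model.parameters())
#   optimizer.step()                       # identical update on every process
```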
Thanks for the reply—actually, I ended up realizing I'm looking for parallelized sampling, not parallelized consumption.
@bycn Parallel sampling is also implemented. Check how "num_parallel_processes" is used here: https://github.com/richardrl/rlkit-relational/blob/master/examples/relationalrl/train_pickandplace1.py
It's unclear to me how parallel sampling is implemented here without vectorized environments, i.e., like https://stable-baselines.readthedocs.io/en/master/guide/vec_envs.html. Can you explain the approach?
I don't know what vectorized environments do internally, but my approach is simple: I start a Mujoco instance in each of several worker threads, each with its own local replay buffer. Each worker samples its environment to fill that local buffer and computes a gradient, and the gradients are then averaged in the optimizer file linked above.
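Schematically, each worker's loop looks something like the sketch below (the names are placeholders for illustration, not the actual classes in rlkit-relational):

```python
def worker_loop(make_env, make_policy, make_buffer, make_mpi_optimizer,
                compute_loss, num_iterations, rollout_steps, batch_size):
    env = make_env()                        # separate Mujoco instance per worker
    policy = make_policy()                  # identical architecture in every worker
    buffer = make_buffer()                  # local replay buffer, never shared
    optimizer = make_mpi_optimizer(policy)  # optimizer that averages grads over MPI

    obs = env.reset()
    for _ in range(num_iterations):
        # 1) Sample this worker's own environment into its local buffer.
        for _ in range(rollout_steps):
            action = policy.get_action(obs)
            next_obs, reward, done, _ = env.step(action)
            buffer.add(obs, action, reward, next_obs, done)
            obs = env.reset() if done else next_obs

        # 2) Compute a gradient from a batch drawn out of the local buffer.
        batch = buffer.sample(batch_size)
        loss = compute_loss(policy, batch)
        optimizer.zero_grad()
        loss.backward()

        # 3) The optimizer averages gradients across all workers (via MPI)
        #    before applying the update, so every worker takes the same step
        #    and the policies stay synchronized.
        optimizer.step()
```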
If you have further questions about my implementation, feel free to open an issue in that repo.