stable-baselines3 icon indicating copy to clipboard operation
stable-baselines3 copied to clipboard

[Question] The use of close() for remote in SubprocVecEnv

Open dwyzzy opened this issue 2 years ago • 1 comments

❓ Question

Hi! I'm wondering about the close() in SubprocVecEnv.

When initializing work_remote and remote, stable-baselines3 code uses close() for work_remote :

        for work_remote, remote, env_fn in zip(self.work_remotes, self.remotes, env_fns):
            args = (work_remote, remote, CloudpickleWrapper(env_fn))
            # daemon=True: if the main process crashes, we should not cause things to hang
            process = ctx.Process(target=_worker, args=args, daemon=True)  # pytype:disable=attribute-error
            process.start()
            self.processes.append(process)
            work_remote.close()

After initializing. the code in _worker uses close() for parent_remote:

def _worker(
    remote: mp.connection.Connection, parent_remote: mp.connection.Connection, env_fn_wrapper: CloudpickleWrapper
) -> None:
    # Import here to avoid a circular import
    from stable_baselines3.common.env_util import is_wrapped

    parent_remote.close()

So both work_remote and remote use close()? How can we use pipe to communicate later? Does it indeed need here?

Checklist

  • [X] I have checked that there is no similar issue in the repo
  • [X] I have read the documentation
  • [X] If code there is, it is minimal and working
  • [X] If code there is, it is formatted using the markdown code blocks for both code and stack traces.

dwyzzy avatar Dec 06 '22 11:12 dwyzzy

Hello, the original code comes from https://github.com/openai/baselines/commit/bb403781182c6e31d3bf5de16f42b0cb0d8421f7#diff-e3578f89a8370e447f38686c0121044f4ba262b016085e8da4c981fbd784515c (by @joschu)

So both work_remote and remote use close()? How can we use pipe to communicate later? Does it indeed need here?

If you look closely and read the Pipe() doc, the parent process closes the work_remote (called parent_remote in the worker) and the child processes close the parent_remote (called remote in the worker), so they can still communicate (the parent will use self.remotes and the child will use self.work_remote).

I agree the code is confusing, I guess they used two pipes instead of one to avoid dead lock (or waiting for a connection to be free) and they close the remote not used to be more explicit. But the code could be simplified to use only one remote (if you try and there is no problem, we could welcome a PR ;))

cc @AdamGleave @Miffyli @hill-a @ernestum @qgallouedec in case you have a better explanation?

araffin avatar Dec 06 '22 12:12 araffin