
AsyncCollector does not seem to collect asynchronously

Open meier-johannes94 opened this issue 2 years ago • 1 comments

This is more of a question about whether I made a mistake. I am using Dueling DQN and built my training procedure similarly to your Atari DQN example. However, resetting my environment instances takes a long time, so I am using an AsyncCollector with SubprocVectorEnv. My impression is that when one environment resets, the others wait until it has finished, and only then is the next environment step collected for all of them. I expected the AsyncCollector to handle this situation more intelligently. Is there a mistake or a misunderstanding on my side?

Here is a short part of my code:

self.envs = SubprocVectorEnv(
    [self.create_single_env(apartments[i])
     for i in range(self.args.parallel_envs)]
)

self.train_collector = AsyncCollector(
    self.policy, self.envs, self.buffer, exploration_noise=True
)

I use a prioritized replay buffer:

return PrioritizedVectorReplayBuffer(
    self.args.buffer_size,
    buffer_num=len(self.envs),
    ignore_obs_next=True,
    alpha=self.args.alpha,
    beta=self.args.beta
)

meier-johannes94, Mar 25 '22 17:03

Hmm, the current code does what you described. However, the best approach is to handle reset and step together. If the underlying env has an auto-reset wrapper, i.e., step() performs the reset itself when done == True, we only need to call venv.step() and never need to call venv.reset() in the middle.
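
To illustrate the auto-reset idea, here is a minimal sketch in plain Python. It is not Tianshou's or EnvPool's implementation: `SlowResetEnv`, `AutoResetEnv`, and the `terminal_obs` info key are hypothetical names made up for this example, and the env is assumed to use the classic 4-tuple `step()` return `(obs, rew, done, info)`.

```python
class SlowResetEnv:
    """Hypothetical toy env; stands in for an env whose reset() is expensive."""

    def __init__(self, episode_len=3):
        self.episode_len = episode_len
        self.t = 0
        self.resets = 0  # counts how often reset() was called

    def reset(self):
        self.t = 0
        self.resets += 1
        return 0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return self.t, 1.0, done, {}


class AutoResetEnv:
    """Hypothetical wrapper: resets inside step() as soon as an episode ends."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        if done:
            # Keep the last observation of the finished episode in info
            # (assumed key name), then return the first observation of
            # the next episode instead.
            info = dict(info, terminal_obs=obs)
            obs = self.env.reset()
        return obs, rew, done, info


env = AutoResetEnv(SlowResetEnv(episode_len=3))
obs = env.reset()  # the only explicit reset the caller ever makes
trace = []
for _ in range(7):
    obs, rew, done, info = env.step(0)
    trace.append((obs, done))
```

With a wrapper like this in each worker, the slow reset runs inside that worker's own step() call, so the caller never has to issue a separate reset() for finished envs in the middle of collection.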

I'm trying to plug in EnvPool's async step API, but I won't have time until May. See EnvPool's approach.

Trinkle23897, Mar 25 '22 19:03