tianshou
AsyncCollector does not seem to collect asynchronously
This is more of a question about whether I have made a mistake. I am using Dueling DQN and built my training procedure along the lines of your Atari DQN example. However, resetting my environment instances takes a long time, so I am using an AsyncCollector with a SubprocVectorEnv. My impression is that when one environment resets, the others wait until it has finished, and only then is the next environment step collected for all of them. I was expecting the AsyncCollector to handle this situation more intelligently. Is there a mistake or a misunderstanding on my side?
Here is a short part of my code:
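from tianshou.data import AsyncCollector
from tianshou.env import SubprocVectorEnv

# One subprocess per environment instance.
self.envs = SubprocVectorEnv(
    [self.create_single_env(apartments[i])
     for i in range(self.args.parallel_envs)]
)
self.train_collector = AsyncCollector(
    self.policy, self.envs, self.buffer, exploration_noise=True
)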
I use a prioritized replay buffer:
return PrioritizedVectorReplayBuffer(
    self.args.buffer_size,
    buffer_num=len(self.envs),
    ignore_obs_next=True,
    alpha=self.args.alpha,
    beta=self.args.beta,
)
Hmm, the current code does what you said. However, the best way is to handle reset and step together. If the underlying env has an auto-reset wrapper, i.e., step(*) == reset() when done == True, we only need to call venv.step(), and there is no need to call venv.reset() in the middle.
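To make the auto-reset idea concrete, here is a minimal sketch, not Tianshou's or EnvPool's actual implementation. It assumes the old Gym-style 4-tuple step API; `AutoResetEnv`, `CountdownEnv`, and the `terminal_obs` info key are hypothetical names used only for illustration.

```python
class AutoResetEnv:
    """Hypothetical wrapper: when an episode ends, step() resets the env
    and returns the first observation of the next episode."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if done:
            # Keep the true final observation, then reset immediately so
            # the caller never has to call reset() between episodes.
            info = dict(info, terminal_obs=obs)
            obs = self.env.reset()
        return obs, reward, done, info


class CountdownEnv:
    """Toy env (assumption): an episode lasts `length` steps and the
    observation is the number of steps remaining."""

    def __init__(self, length=3):
        self.length = length
        self.remaining = length

    def reset(self):
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, 1.0, done, {}


env = AutoResetEnv(CountdownEnv(length=3))
obs = env.reset()
trace = []
for _ in range(5):
    # Only step() is called; the reset happens inside the wrapper.
    obs, reward, done, info = env.step(0)
    trace.append((obs, done))
```

With a wrapper like this inside each worker, a slow reset is folded into that worker's own step() call, so a collector that only waits for a subset of ready environments would not need a separate, blocking reset phase.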
I'm trying to plug in EnvPool's async step API, but I won't have time until May. See EnvPool's approach.