Deep-Reinforcement-Learning-Algorithms-with-PyTorch
Seeding in 'reset_game'
I have a question regarding the method reset_game in Base_Agent. The first few lines read:
def reset_game(self):
    """Resets the game information so we are ready to play a new episode"""
    self.environment.seed(self.config.seed)
    self.state = self.environment.reset()
I am concerned about the seeding. If I understand correctly, reset_game is called every time an episode finishes.
Assume we implement the seed method in our environment like this:
from gym.utils import seeding

def seed(self, seed=None):
    self.np_random, seed = seeding.np_random(seed)
    return [seed]
This is actually the method used in Bit_Flipping_Environment.
If we were to actually use self.np_random for resetting the environment, we would always see the same initial state over and over again, because reset_game re-seeds with the same self.config.seed at the start of every episode. At least that is the behaviour I appear to be experiencing.
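To illustrate what I mean, here is a minimal sketch (using numpy's RandomState directly as a stand-in for seeding.np_random): re-seeding with the same seed before every episode makes each "random" initial state identical.

import numpy as np

def sample_initial_state(seed):
    # Re-seed, as reset_game effectively does at the start of every episode
    np_random = np.random.RandomState(seed)
    # Draw a "random" initial bit pattern from the freshly seeded generator
    return np_random.randint(0, 2, size=8)

print(sample_initial_state(42))  # some bit pattern
print(sample_initial_state(42))  # exactly the same pattern, every "episode"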
The environments implemented in this repository, e.g. Bit_Flipping_Environment, seem to circumvent this issue by not using self.np_random at all. Instead, the random module is used. In fact, I don't quite understand why np_random is a member of Bit_Flipping_Environment at all.
Correct me if I'm wrong, but doesn't this make the use of seeds completely pointless (because the random module is never seeded)?
I would have expected the environment's seed method to be called exactly once per run, for example along the lines of the sketch below. Calling it once per episode simply doesn't make sense to me.
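For illustration only, a hypothetical restructuring (the constructor and the config/environment attribute names are assumptions on my part, not the repository's actual code) would seed once when the agent is created and only reset afterwards:

class Base_Agent:
    def __init__(self, config):
        self.config = config
        self.environment = config.environment
        # Seed the environment once per run, when the agent is created
        self.environment.seed(self.config.seed)

    def reset_game(self):
        """Resets the game information so we are ready to play a new episode"""
        # No re-seeding here; just reset to start a genuinely new episode
        self.state = self.environment.reset()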
Please correct me if I misunderstood anything, but this doesn't seem right to me.
Best, Markus
Hi, I'd have to check, but it sounds like I made a mistake and you are correct that it shouldn't be there.