Deep-Reinforcement-Learning-Algorithms-with-PyTorch

Seeding in 'reset_game'

Open • Markus28 opened this issue 4 years ago • 1 comment

I have a question regarding the method reset_game in Base_Agent. The first few lines read:

    def reset_game(self):
        """Resets the game information so we are ready to play a new episode"""
        self.environment.seed(self.config.seed)
        self.state = self.environment.reset()

I am concerned about the seeding. If I understand correctly, 'reset_game' is called any time an episode is completed.

Assume we implement the seed method in our environment like this:

    def seed(self, seed=None):
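        # seeding here is gym.utils.seeding, the standard Gym helper for per-env RNGs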
        self.np_random, seed = seeding.np_random(seed)
        return [seed]

This is actually the method used in Bit_Flipping_Environment.

If we actually used self.np_random to reset the environment, we would see the same initial state over and over again; at least that is the behaviour I appear to be experiencing. The environments implemented in this repository, e.g. Bit_Flipping_Environment, seem to sidestep this issue by not using self.np_random at all; instead, the random module is used. In fact, I don't quite understand why np_random is a member of Bit_Flipping_Environment at all. Correct me if I'm wrong, but doesn't this make the use of seeds completely pointless (because random is never seeded)?
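For illustration, here is a minimal, self-contained sketch of the problem. ToyEnv and everything in it are made up for this example (numpy's RandomState stands in for gym.utils.seeding); none of this is code from the repository:

    import numpy as np

    class ToyEnv:
        """Toy environment whose reset() draws the initial state from self.np_random."""

        def seed(self, seed=None):
            # stand-in for gym.utils.seeding.np_random(seed)
            self.np_random = np.random.RandomState(seed)
            return [seed]

        def reset(self):
            # the initial state comes from the seeded generator
            return self.np_random.randint(0, 2, size=4)

    env = ToyEnv()
    for episode in range(3):
        env.seed(42)          # what reset_game effectively does before every episode
        print(env.reset())    # the same initial state is printed three times

Re-seeding with the same value before every episode pins the generator, so reset() can only ever produce one initial state.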

I would have expected the environment's seed method to be called exactly once per run. Calling it once per episode simply doesn't make sense to me. Please correct me if I have misunderstood anything, but this doesn't seem right to me.
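For comparison, here is a sketch of what I would have expected, i.e. seeding once per run. Again, this is a made-up example, not code from the repository:

    import numpy as np

    class ToyEnvSeededOnce:
        """Variant of the toy example above that is seeded once, at construction time."""

        def __init__(self, seed=None):
            # seed exactly once, at the start of the run
            self.np_random = np.random.RandomState(seed)

        def reset(self):
            return self.np_random.randint(0, 2, size=4)

    env = ToyEnvSeededOnce(seed=42)
    for episode in range(3):
        print(env.reset())    # initial states differ across episodes, yet the run is reproducible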

Best, Markus

Markus28 • Jun 01 '21 14:06

Hi, I’d have to check, but it sounds like I made a mistake and you are correct: it shouldn’t be there.

p-christ • Jun 01 '21 14:06