Why is the initial state of the Breakout environment the same with different seeds?
I tried the following code and found that the initial state of the Breakout environment is the same for different seeds. Why is that? And how can I get a different initial state?
import gymnasium as gym
import numpy as np

for s in [0, 1, 2, 3, 4]:
    env = gym.make("BreakoutNoFrameskip-v4")
    observation, info = env.reset(seed=s * 1000)
    print(s, np.sum(observation))
I wouldn't be surprised if Breakout always starts in the same state, but the seeded randomness will still affect what happens afterwards.
Thanks for the clarification! According to this link, action sampling is not determined by the env seed. So what is the impact of different seeds in the Breakout env? Also, is there a way to manually modify the initial state? Thanks in advance for your help!
For Atari, I believe you can access the RAM state and modify it.
Thanks! What is the impact of different environment seeds?
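Besides editing RAM, a common way to vary the start state in Atari is to take a random number of no-op steps right after reset (Gymnasium's `AtariPreprocessing` wrapper exposes this as `noop_max`). This is a different technique from RAM editing. Below is a rough sketch, assuming the Gymnasium `reset`/`step` API and that action 0 is NOOP. `reset_with_random_noops` is a hypothetical helper, not part of any library.

```python
import random


def reset_with_random_noops(env, seed, noop_max=30, noop_action=0):
    """Reset env, then take a seed-dependent number of no-op steps.

    Assumes the Gymnasium API: reset() returns (obs, info) and
    step() returns (obs, reward, terminated, truncated, info).
    """
    rng = random.Random(seed)           # decides how many no-ops to take
    obs, info = env.reset(seed=seed)
    n_noops = rng.randint(1, noop_max)  # inclusive on both ends
    for _ in range(n_noops):
        obs, reward, terminated, truncated, info = env.step(noop_action)
        if terminated or truncated:     # unlikely this early, but stay safe
            obs, info = env.reset()
    return obs, info
```

In the snippet from the question, `env.reset(seed=s*1000)` would become `reset_with_random_noops(env, seed=s*1000)`. Whether the resulting frame actually differs depends on the game: in Breakout the ball is not served until FIRE is pressed, so you may need to follow the no-ops with a FIRE action.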