milesbrundage
Sounds good! Just started a new run with epsilon annealing and a lot of hyperparameter changes... will see how that goes and send a pull request if it goes well.
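For anyone following along, here is a minimal sketch of what linear epsilon annealing could look like. The function name, the start and end values, and the annealing length are illustrative assumptions, not the settings used in this run, and `step` can count frames or episodes depending on how the training loop is written.

```python
def annealed_epsilon(step, eps_start=1.0, eps_end=0.1, anneal_steps=1_000_000):
    """Linearly move epsilon from eps_start to eps_end over anneal_steps.

    All default values are assumptions for illustration only.
    After anneal_steps, epsilon stays at eps_end.
    """
    # Fraction of the annealing schedule completed, clamped to [0, 1].
    fraction = min(max(step, 0) / anneal_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)
```

With an epsilon-greedy policy, the agent would take a random action with probability `annealed_epsilon(step)` and the greedy action otherwise, so exploration shrinks as training progresses.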
I unfortunately haven't had time to do frame skip or grayscaling yet, but have been running training on Breakout with changes to hyperparameters and with epsilon annealing for about 7000...
(I had an error in the above the first time I posted it, but have just fixed it. My computer has crashed a few times while running this, so sometimes I've changed...
I also just saw that the description of the Breakout environment (and the other Atari environments) seems to suggest actions are already automatically repeated, though not sure how this should...
Re: number 1, note that 1,000,000 refers to frames, whereas 50 in this code refers to episodes. I am currently trying a 1,000 episode memory (among other hyperparameter changes) and...
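To make the frames-versus-episodes distinction concrete, here is a rough sketch of a replay memory whose capacity is counted in transitions (frames) rather than episodes. The class and method names are hypothetical and are not taken from the code under discussion; the point is only that an episode-based limit holds a varying number of frames, while a transition-based limit is directly comparable to a figure like 1,000,000 frames.

```python
import random
from collections import deque


class FrameReplayMemory:
    """Hypothetical replay memory capped by number of transitions, not episodes."""

    def __init__(self, capacity_transitions):
        # deque with maxlen silently drops the oldest items once full.
        self.buffer = deque(maxlen=capacity_transitions)

    def add_episode(self, transitions):
        # Episodes of any length are flattened into individual transitions.
        self.buffer.extend(transitions)

    def sample(self, batch_size):
        # Uniform sample without replacement, limited by what is stored.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

Under this sketch, a long episode evicts more old experience than a short one, which is not the case when the memory is limited to a fixed number of episodes.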