simple_dqn
slow training speed in latest code?
I just tried the latest code and found that training speed has slowed down significantly: it used to be more than 200 steps_per_second, but now it's around 100 steps_per_second.
2017-09-24 15:08:08,844 Epoch #169
2017-09-24 15:08:08,844 Training for 250000 steps
2017-09-24 15:43:51,299 num_games: 1101, average_reward: 24.793824, min_game_reward: 0, max_game_reward: 400
2017-09-24 15:43:51,299 last_exploration_rate: 0.100000, epoch_time: 2143s, steps_per_second: 116
2017-09-24 15:43:51,300 Saving weights to snapshots/breakout_169.prm
2017-09-24 15:43:51,325 Testing for 125000 steps
2017-09-24 15:54:51,175 num_games: 70, average_reward: 240.414286, min_game_reward: 31, max_game_reward: 426
2017-09-24 15:54:51,175 last_exploration_rate: 0.050000, epoch_time: 660s, steps_per_second: 189
I wonder what recent change may have caused the slowdown.
Unfortunately I'm not actively working on this codebase any more, but I would be happy to accept a PR. Leaving this open till then.
Hello! What is the usual training speed on CPU? I get only 7 steps_per_second (14 steps_per_second with MKL), which is very slow. Is it possible to improve the CPU performance of this algorithm?
Unfortunately no, you definitely need a GPU for training.
@tambetm, is this an issue with the Q-learning algorithm itself or with the current implementation? I mean, do you have an idea whether I could reach higher performance with Q-learning on CPU using other tools and frameworks?
All reinforcement learning algorithms are compute intensive. Asynchronous Advantage Actor-Critic (A3C) is the one that can be more easily parallelized to use multiple CPUs. Search "a3c github" for some example implementations.
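For intuition on how A3C spreads work across CPU cores, here is a minimal sketch of the parallel-rollout structure only, not the full algorithm; DummyEnv and run_worker are illustrative names invented for this example, not part of simple_dqn:

import multiprocessing as mp
import random

class DummyEnv:
    # Stand-in for an Atari environment: step() returns a random reward
    # and ends an episode with small probability.
    def reset(self):
        return 0
    def step(self, action):
        return 0, random.random(), random.random() < 0.01, {}

def run_worker(worker_id, global_steps, lock, max_steps=10000):
    # Each worker owns its environment copy; in real A3C it would also
    # compute gradients locally and apply them asynchronously to shared
    # network parameters.
    env = DummyEnv()
    env.reset()
    while True:
        with lock:
            if global_steps.value >= max_steps:
                return
            global_steps.value += 1
        obs, reward, done, info = env.step(0)
        if done:
            env.reset()

if __name__ == "__main__":
    steps = mp.Value("i", 0)
    lock = mp.Lock()
    workers = [mp.Process(target=run_worker, args=(i, steps, lock))
               for i in range(mp.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("total environment steps:", steps.value)

Because the workers are separate processes, environment throughput scales with the number of cores instead of being bound to a single loop.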
How do you find the steps per second? I am also running on a CPU and just get this output, which hangs:
./train.sh Breakout-v0 --environment gym
No handlers could be found for logger "gym.envs.registration"
But top shows it's processing away.
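For reference, steps_per_second is printed with the epoch statistics once an epoch finishes (see the log excerpt above). A rough manual estimate can also be taken by timing the loop yourself; a minimal sketch, where step_fn is a hypothetical stand-in for one training step:

import time

def measure_steps_per_second(step_fn, n_steps=1000):
    # Time n_steps calls of a single training step and report the rate.
    start = time.time()
    for _ in range(n_steps):
        step_fn()
    return n_steps / (time.time() - start)

if __name__ == "__main__":
    # Toy workload standing in for an actual training step.
    print(measure_steps_per_second(lambda: sum(range(1000))))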
The >200 steps_per_second figure is on GPU.
So it's not possible to run it at all on CPU?
Your error refers to logging and shouldn't stop it from proceeding. The training is just slow, so it may take a while before training statistics are printed. You can run the pre-trained Pong and Breakout models against the game without a GPU.
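As for the warning itself: under Python 2, the logging module prints "No handlers could be found for logger ..." when a library logger emits a record before any handler has been configured. Configuring a root handler once at startup silences it; a minimal sketch:

import logging

# Attach a root handler so library loggers (e.g. gym.envs.registration)
# have somewhere to send their records.
logging.basicConfig(level=logging.INFO)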