simple_dqn

slow training speed in latest code?

Open mw66 opened this issue 7 years ago • 9 comments

I just tried the latest code and found that the training speed slowed down significantly: it used to be more than 200 steps_per_second, but right now it's ~100 steps_per_second.

2017-09-24 15:08:08,844 Epoch #169
2017-09-24 15:08:08,844 Training for 250000 steps
2017-09-24 15:43:51,299 num_games: 1101, average_reward: 24.793824, min_game_reward: 0, max_game_reward: 400
2017-09-24 15:43:51,299 last_exploration_rate: 0.100000, epoch_time: 2143s, steps_per_second: 116
2017-09-24 15:43:51,300 Saving weights to snapshots/breakout_169.prm
2017-09-24 15:43:51,325 Testing for 125000 steps
2017-09-24 15:54:51,175 num_games: 70, average_reward: 240.414286, min_game_reward: 31, max_game_reward: 426
2017-09-24 15:54:51,175 last_exploration_rate: 0.050000, epoch_time: 660s, steps_per_second: 189
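(As a sanity check on the numbers above: steps_per_second is just total steps divided by wall-clock epoch time, so the figures in the log are internally consistent.)

```python
# Sanity check of the reported throughput: steps_per_second is simply
# the step count divided by the epoch's wall-clock time.
train_steps, train_time = 250000, 2143   # from the training log above
test_steps, test_time = 125000, 660      # from the testing log above

print(train_steps // train_time)  # -> 116
print(test_steps // test_time)    # -> 189
```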

I wonder what recent change may have caused the slowdown.

mw66 avatar Sep 24 '17 16:09 mw66

Unfortunately I'm not actively working on this codebase any more, but I would be happy to accept a PR. Leaving this open till then.

tambetm avatar Sep 25 '17 06:09 tambetm

Hello! What is the usual training speed on CPU? I get only 7 steps_per_second (14 steps_per_second with MKL), which is very slow. Is it possible to improve the CPU performance for this algorithm?

iNomaD avatar Oct 19 '17 19:10 iNomaD

Unfortunately no, you definitely need GPU for training.

tambetm avatar Oct 19 '17 19:10 tambetm

@tambetm, is it the issue of the Q-learning algorithm itself or of the current implementation? I mean, do you have an idea whether I could reach higher performance with Q-learning on CPU using other tools and frameworks?

iNomaD avatar Oct 19 '17 20:10 iNomaD

All reinforcement learning algorithms are compute intensive. Asynchronous Advantage Actor-Critic (A3C) is the one that can be more easily parallelized to use multiple CPUs. Search "a3c github" for some example implementations.

tambetm avatar Oct 19 '17 20:10 tambetm

How do you find the steps per second? I am also running on a CPU and just get this output, after which it hangs:

./train.sh Breakout-v0 --environment gym
No handlers could be found for logger "gym.envs.registration"

But top shows it's processing away.

mcbrs1a avatar Nov 24 '17 19:11 mcbrs1a

it's on GPU.

mw66 avatar Nov 25 '17 18:11 mw66

So it's not possible to run at all on cpu?

mcbrs1a avatar Nov 26 '17 22:11 mcbrs1a

Your error refers to logging and shouldn't stop it from proceeding. Training on CPU is just slow, so it might take a while before it prints training statistics. You can, however, run the pre-trained Pong and Breakout models against the game without a GPU.
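For reference, that "No handlers could be found for logger ..." message is Python 2's logging module complaining that no handler was configured on the root logger; registering a basic handler at the top of the launcher script silences it (a hedged sketch, not code from this repo):

```python
import logging

# Attaching a handler to the root logger silences "No handlers could be
# found for logger ..." warnings emitted by libraries such as
# gym.envs.registration, and lets their log records actually print.
logging.basicConfig(level=logging.INFO)

print(len(logging.getLogger().handlers) > 0)  # -> True
```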

tambetm avatar Nov 27 '17 07:11 tambetm