DQN-tensorflow

Slower than deep_q_rl

Open LinZichuan opened this issue 8 years ago • 8 comments

Hi, I found that this implementation is slower than deep_q_rl, which is implemented in Theano. Is it because this repo uses OpenAI Gym rather than ROM files? Is it a performance difference between TensorFlow and Theano? Or some other detail?

deep_q_rl runs 100-200 steps during the learning process, but DQN-tensorflow only manages 70-90. This makes training slow, so it cannot run 200M steps in 10 days as in the DQN Nature paper.

LinZichuan avatar Feb 13 '17 05:02 LinZichuan
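For scale, a back-of-the-envelope check of the throughput the Nature schedule implies, assuming "200M" refers to environment frames (divide by the frameskip factor if it means agent steps):

```python
# Rough throughput needed to finish 200M frames in 10 days.
frames = 200e6
seconds = 10 * 24 * 3600
print(frames / seconds)  # ~231 frames per second
```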

Looks like it's currently not using the GPU efficiently; see #21.

ppwwyyxx avatar Feb 13 '17 06:02 ppwwyyxx

I ran some experiments a while ago, and observed the same thing.

In my experiments (which did not use this repo), everything other than the underlying NN library was the same, and mini-batches were fed to the GPU in the same manner (a mini-batch is sent every time the network is updated), yet Theano was still faster than TensorFlow.

According to #2919 and #3377, TensorFlow's session.run method does more than just feed data to the GPU, so I guess that overhead is what makes training slower than with Theano.

mthrok avatar Feb 13 '17 07:02 mthrok

@mthrok The two issues are saying that feed_dict is slow (not that session.run is slow). It's actually good practice to avoid using feed_dict inside training loops to reduce the overhead relative to other frameworks.

ppwwyyxx avatar Feb 13 '17 13:02 ppwwyyxx
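To make the feed_dict point concrete, here is a minimal sketch, not from this repo: it assumes TF >= 1.4 and a replay buffer with a `sample()` method, and shows one way to move minibatch feeding out of feed_dict and into a prefetched tf.data pipeline.

```python
# Hypothetical sketch. The slow pattern under discussion looks like:
#   s, a, r, s2, done = replay.sample(32)
#   sess.run(train_op, feed_dict={state_ph: s, action_ph: a, ...})
# which copies the minibatch across the Python/C++ boundary on every call.
import tensorflow as tf

def make_batch_tensors(replay, batch_size=32):
    """Return graph tensors that yield prefetched minibatches from `replay`."""
    def sample_gen():
        while True:
            # replay.sample is assumed to return numpy arrays
            yield replay.sample(batch_size)

    dataset = tf.data.Dataset.from_generator(
        sample_gen,
        output_types=(tf.float32, tf.int32, tf.float32, tf.float32, tf.float32))
    dataset = dataset.prefetch(1)  # overlap sampling with the training step
    return dataset.make_one_shot_iterator().get_next()

# states, actions, rewards, next_states, dones = make_batch_tensors(replay)
# Build the loss/train_op on these tensors, then call sess.run(train_op)
# with no feed_dict inside the training loop.
```

With the batch tensors wired directly into the graph, the next minibatch can be prepared while the previous sess.run(train_op) is still executing, which is roughly the practice described above.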

Has anyone solved the slow-training problem? In my case, training takes almost 600 hours.

Lan1991Xu avatar Mar 17 '17 11:03 Lan1991Xu

@ppwwyyxx So nice to see you here, Tensorpack author. Why does this repo's performance differ so much from your examples in tensorpack? I don't see a major difference, except that this repo collects experience replay in the same thread as training. Does that matter? Or is it because you used the ROM directly?

quhezheng avatar Jun 03 '17 12:06 quhezheng

I don't know why. Maybe the use of feed_dict is the main reason. Using a separate thread improves speed, but not significantly in my case. Using the ROM should make no difference to speed.

ppwwyyxx avatar Jun 06 '17 09:06 ppwwyyxx
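For reference, a minimal sketch (with hypothetical `env`, `policy`, and `replay` objects) of collecting experience in a separate thread from training, as discussed above. Whether it helps depends on how expensive env.step() is relative to a network update, and the replay buffer must be safe to access from both threads.

```python
import threading

def actor_loop(env, policy, replay, stop_event):
    """Step the environment forever, pushing transitions into `replay`."""
    obs = env.reset()
    while not stop_event.is_set():
        action = policy(obs)  # e.g. epsilon-greedy action from the current network
        next_obs, reward, done, _ = env.step(action)
        replay.add(obs, action, reward, next_obs, done)
        obs = env.reset() if done else next_obs

stop = threading.Event()
# threading.Thread(target=actor_loop, args=(env, policy, replay, stop), daemon=True).start()
# Main thread: sample from `replay` and run gradient updates; call stop.set() when done.
```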

@ppwwyyxx I failed to describe the issue clearly. The issue I run into with this repo is that the best training reward is 30, nowhere near the example in your repo. I compared the code but don't see a major difference. I changed your example by replacing direct ROM access with Gym, with no changes to the network or training; unfortunately, the output from your code is then as bad as this repo's: the best reward is just 50 and makes no further progress after a million steps.

So I guess the Gym environment itself has a bug, while the ROM is free of the issue. I thought you were aware of this and that is why you use the ROM instead of Gym.

quhezheng avatar Jun 20 '17 01:06 quhezheng

It's not a bug. Gym environments (**-v0) are just a harder setting because they have more randomness. You can use other Gym settings; e.g. BreakoutDeterministic-v4 is closest to a naive Atari wrapper. However, even with Breakout-v0 you should still be able to see better performance than 50.

ppwwyyxx avatar Jun 20 '17 07:06 ppwwyyxx
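For reference, the Gym Atari variants mentioned above differ mainly in frame skip and action randomness; the defaults noted below reflect older gym releases and may vary by version.

```python
import gym

# Breakout-v0: random frameskip (2-4) plus sticky actions -- the noisier, harder setting.
env = gym.make("Breakout-v0")

# BreakoutDeterministic-v4: fixed frameskip of 4 and no sticky actions --
# closest to a plain ALE/ROM wrapper.
env = gym.make("BreakoutDeterministic-v4")
```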