tensorflow-rl

Training slowing down dramatically

Open ionelhosu opened this issue 8 years ago • 1 comment

Has anyone else seen the training process slow down over time? For example, training one DQN-CTS worker on Montezuma's Revenge runs at about 220 iter/sec after 100,000 steps but only 35 iter/sec after 400,000. Any thoughts? Thank you.

ionelhosu, Oct 30 '17 14:10

Hi @ionelhosu, I think that while it's running at 220 iter/sec the training hasn't actually started yet; the worker is just filling the replay buffer until it reaches some minimum size. That explains the slowdown, but it is unexpected just how slow the training updates are. I'd like to do some TensorFlow profiling to find the bottleneck, but if you find anything interesting on your own, please let me know.

steveKapturowski, Nov 09 '17 21:11
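
The warm-up effect described in the reply can be checked with a rough sketch like the one below. It is not code from tensorflow-rl: `run_worker`, `act_fn`, `train_fn`, and `min_replay_size` are hypothetical names chosen for illustration. The loop only collects transitions until the replay buffer reaches a minimum size, then runs one training update per step, and it times the two phases separately so their iter/sec can be compared.

```python
import time
from collections import deque


def run_worker(total_steps, min_replay_size, act_fn, train_fn,
               buffer_capacity=10000):
    """Hypothetical worker loop: warm up the replay buffer, then train.

    act_fn(step) returns a transition; train_fn(replay) performs one update.
    Both are placeholders, not part of any real library.
    """
    replay = deque(maxlen=buffer_capacity)
    stats = {"warmup_steps": 0, "train_steps": 0,
             "warmup_seconds": 0.0, "train_seconds": 0.0}
    for step in range(total_steps):
        start = time.perf_counter()
        replay.append(act_fn(step))
        if len(replay) >= min_replay_size:
            # Buffer is large enough: this step also pays for an update.
            train_fn(replay)
            phase = "train"
        else:
            # Still filling the buffer: acting only, so steps are cheap.
            phase = "warmup"
        stats[phase + "_steps"] += 1
        stats[phase + "_seconds"] += time.perf_counter() - start
    return stats


def iters_per_sec(steps, seconds):
    # Guard against a zero elapsed time on very short runs.
    return steps / max(seconds, 1e-9)


if __name__ == "__main__":
    # Dummy stand-ins: a cheap "act" and a deliberately heavier "train".
    result = run_worker(
        total_steps=2000,
        min_replay_size=500,
        act_fn=lambda step: step,
        train_fn=lambda replay: sum(replay),
    )
    print("warm-up iter/sec:",
          iters_per_sec(result["warmup_steps"], result["warmup_seconds"]))
    print("training iter/sec:",
          iters_per_sec(result["train_steps"], result["train_seconds"]))
```

If the per-phase rates in a real run differ by far more than the cost of one update would suggest, that points at the update itself (or data movement around it) as the place to profile.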