GPU vs CPU performance

Open meneguzzi opened this issue 5 years ago • 1 comments

Hi @thiagopbueno,

I'm also working with @ramonpereira and @miquelramirez, and I have been trying to run tf-plan on a Linux box with GPUs. However, in our experiments (the same domains as in issue #2), running the planner with tensorflow-gpu installed instead of plain tensorflow takes substantially longer.

For example, when running

time tfplan lqr_instance_problem0.rddl -hr 100 -e 1000 -v --viz=generic -b 128 -lr 0.005 -m online

I get two different times at the end of the process, depending on whether I'm running with GPUs or not. First, with GPUs (using either 1 or 3 GPUs does not change the times substantially):

real    193m34.868s
user    405m10.476s
sys     40m27.540s

Whereas the times for running with CPUs are:

real    66m31.023s
user    208m0.132s
sys     55m38.096s

This is really weird to me, as the GPU run takes almost 3 times as long as the CPU run in wall-clock time.
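To make the comparison concrete, here is a minimal sketch that converts the two `time` reports above into seconds and computes the wall-clock ratio. The helper name `parse_time` and the "average busy cores" figure, computed as (user + sys) / real, are illustrative additions and not part of tf-plan.

```python
import re

def parse_time(text):
    """Convert a `time`-style duration such as '193m34.868s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Timings copied from the two runs reported above.
gpu = {"real": "193m34.868s", "user": "405m10.476s", "sys": "40m27.540s"}
cpu = {"real": "66m31.023s", "user": "208m0.132s", "sys": "55m38.096s"}

gpu_s = {name: parse_time(value) for name, value in gpu.items()}
cpu_s = {name: parse_time(value) for name, value in cpu.items()}

# Ratio of wall-clock times: how many times longer the GPU run took.
slowdown = gpu_s["real"] / cpu_s["real"]

# Rough average number of busy CPU cores: (user + sys) / real.
gpu_cores = (gpu_s["user"] + gpu_s["sys"]) / gpu_s["real"]
cpu_cores = (cpu_s["user"] + cpu_s["sys"]) / cpu_s["real"]

print(f"wall-clock slowdown on GPU: {slowdown:.2f}x")   # 2.91x
print(f"average busy cores, GPU run: {gpu_cores:.2f}")  # 2.30
print(f"average busy cores, CPU run: {cpu_cores:.2f}")  # 3.96
```

Note that the host-side user time also roughly doubles in the GPU run (405 min vs. 208 min). That might point to host-side overhead rather than idle waiting on the device, but the `time` output alone cannot confirm it.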

I don't know if I would classify this as a bug; I think of it more as a request for enhancement. Given that many of the domains one would use RDDL to solve are very complex, it would be great to be able to leverage TensorFlow's GPU speedup to the max.

meneguzzi, Dec 13 '18 12:12

Hi @meneguzzi,

Thanks for letting me know that.

I haven't tested tf-plan on a GPU-based platform yet. So, to be honest, at this point I can only speculate on why the times are so different.

I don't know if you are relying on automatic device placement, but if that's the case, it is possible that some of the computations carried out during tf-plan training are simply not efficient to execute on GPUs. Please note that the computations in tf-plan are domain-dependent and therefore potentially very different in nature and structure from typical NN inference/training computations.
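One way to test the device-placement hypothesis without switching between the tensorflow and tensorflow-gpu packages is to hide the GPUs from the process through the `CUDA_VISIBLE_DEVICES` environment variable and time the same command both ways. The sketch below is only an illustration: `timed_run` is a hypothetical helper, and the demo command is a stand-in that just echoes the variable so the script runs on any machine.

```python
import os
import subprocess
import sys
import time

def timed_run(cmd, hide_gpus):
    """Run cmd, optionally hiding all CUDA devices; return (seconds, stdout)."""
    env = dict(os.environ)
    if hide_gpus:
        # "-1" is not a valid device index, so CUDA exposes no GPUs
        # to the child process and TensorFlow falls back to the CPU.
        env["CUDA_VISIBLE_DEVICES"] = "-1"
    start = time.perf_counter()
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return time.perf_counter() - start, result.stdout

# Stand-in command (assumption): prints what the child process sees.
demo_cmd = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES'))",
]

for hide in (False, True):
    seconds, out = timed_run(demo_cmd, hide_gpus=hide)
    print(f"hide_gpus={hide}: {seconds:.2f}s, child saw CUDA_VISIBLE_DEVICES={out.strip()}")
```

For a real comparison, `demo_cmd` would be replaced by the `tfplan` invocation from the original report, split into a list of arguments. Because both runs then use the same tensorflow-gpu install, any difference in timing should come from where the ops execute rather than from the package build.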

Eventually, it would be nice to profile tf-plan's execution w/ and w/o GPUs and compare/visualize the two profiles in TensorBoard to try to pinpoint which parts of the graph take more time on a GPU.
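Once per-op timings from both runs are available (for instance, exported from a profiler), the comparison itself is simple. The sketch below assumes each profile has already been reduced to a mapping from op name to total milliseconds. The function `rank_slowdowns`, the op names, and all numbers are hypothetical placeholders, not measurements from tf-plan.

```python
def rank_slowdowns(cpu_ms, gpu_ms):
    """Return (op, gpu/cpu time ratio) pairs for ops in both profiles, worst first."""
    shared = cpu_ms.keys() & gpu_ms.keys()
    ratios = {op: gpu_ms[op] / cpu_ms[op] for op in shared if cpu_ms[op] > 0}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

# Placeholder values for illustration only; NOT real tf-plan measurements.
cpu_profile = {"transition/MatMul": 4.0, "reward/Sum": 1.0, "optimizer/update": 2.0}
gpu_profile = {"transition/MatMul": 2.0, "reward/Sum": 5.0, "optimizer/update": 3.0}

for op, ratio in rank_slowdowns(cpu_profile, gpu_profile):
    # A ratio above 1 means the op was slower on the GPU than on the CPU.
    print(f"{op}: {ratio:.1f}x")
```

Ops at the top of the list would be the first candidates for pinning to the CPU or restructuring.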

Unfortunately, I won't have time to do this anytime soon. But let me know if you are willing to invest some time and need any help, or if you have any other suggestions or considerations.

thiagopbueno, Jan 23 '19 16:01