
Contact library authors for possible enhancements

Open · Randl opened this issue 8 years ago • 10 comments

I'm a TensorFlow user, so I've opened an issue about TensorFlow's performance in several cases.

One of the things we found out is that the code used by dlbench is suboptimal: https://github.com/tensorflow/tensorflow/issues/7187#issuecomment-278530703

So I thought you might consider contacting the authors of the other libraries too, to get feedback from them.

Randl avatar Feb 10 '17 10:02 Randl

Thanks for your suggestion. We have tried to contact the authors of the other tools to confirm the scripts and configuration files. Feel free to submit a pull request if you have optimal implementations of our tested networks.

shyhuai avatar Feb 10 '17 15:02 shyhuai

@Randl We found a configuration mistake in the MXNet ResNet-50 script and have revised it; we are now re-running the revised script to generate new results. Could you also provide TensorFlow code for the FCN that avoids using feed_dict=feed_dict, so that we can release the newer results together? Please note that the TF version should be 0.11. Thank you!

shyhuai avatar Feb 10 '17 16:02 shyhuai
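
For context, here is a minimal sketch of what "avoiding feed_dict" meant in the TF 0.11 era: instead of feeding NumPy arrays from Python on every step, data is staged through TensorFlow's queue runners, so sess.run() pulls batches straight from the graph. The model, shapes, and data below are illustrative placeholders, not the actual dlbench FCN script.

```python
import numpy as np
import tensorflow as tf

# Stand-in data; the real dlbench FCN runs on MNIST-like inputs.
data = np.random.rand(1024, 784).astype(np.float32)
labels = np.random.randint(0, 10, size=1024).astype(np.int64)

# Queue-based input: the arrays are embedded in the graph and background
# threads assemble batches, so sess.run() needs no feed_dict.
image, label = tf.train.slice_input_producer([data, labels], shuffle=True)
batch_images, batch_labels = tf.train.batch([image, label], batch_size=128)

# A single fully connected layer standing in for the benchmarked FCN.
w = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(batch_images, w) + b
# Pre-1.0 positional signature: (logits, labels).
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits, batch_labels))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())  # pre-1.0 initializer name
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for _ in range(100):
        sess.run(train_op)  # no feed_dict: batches come from the queue
    coord.request_stop()
    coord.join(threads)
```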

@shyhuai You should ask @tfboyd for optimal code.

Randl avatar Feb 10 '17 16:02 Randl

@shelhamer @KeDengMS @piiswrong @soumith Sorry if you're the wrong people to tag. Do you have anything to add? Do you think the benchmark can be improved somehow, or that your framework isn't being used in the most efficient way?

Randl avatar Feb 12 '17 21:02 Randl

Hi @shyhuai ,

I know how hard it is to run a bunch of benchmarks using a wide range of tools. I do not know if I will have time to submit any PRs in the near future, but I will if I can find time. One idea I did have that would make it easier for us to help: we do not do a lot with the CIFAR data sets because the image sizes are really small, and GPUs end up processing, in some cases, 6,000+ samples (images)/sec. I understand moving to ImageNet could be a big change given you have done multiple rounds with CIFAR.

Good luck on future iterations. I cannot say I will always have time, but please feel free to reach out to me for code or whatever. I do not want to influence your results, but I am happy to help as impartially as I can.

tfboyd avatar Feb 12 '17 21:02 tfboyd
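
A quick back-of-the-envelope illustration of the point above: at the throughput tfboyd quotes, a full CIFAR-10 training epoch finishes in seconds, which leaves little sustained computation to measure. The numbers below are just the figures mentioned in the comment.

```python
# At 6,000+ samples/sec, one pass over CIFAR-10's 50,000 training images
# takes under ten seconds, so launch overhead and input-pipeline jitter
# start to dominate per-epoch timings.
cifar10_train_images = 50000
throughput = 6000  # samples/sec, the figure quoted above
print("seconds per epoch: %.1f" % (cifar10_train_images / float(throughput)))
# seconds per epoch: 8.3
```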

Oh sorry, one more thing. We should soon have an MNIST example that does not use the Python feed. It was intended as a tutorial. I will try to submit a PR or, at a minimum, link it to you when it is released.

edit: will have an MNIST example soon.

tfboyd avatar Feb 12 '17 22:02 tfboyd

@tfboyd Thank you very much for your kind response and help. We are also trying to include the real ImageNet data set in the evaluation, but it could take more time to generate results, since it takes several days to train a network model to convergence. I will inform you if we have further progress.

shyhuai avatar Feb 13 '17 03:02 shyhuai

@shyhuai, I appreciate your effort in building benchmarks for the major DL platforms. Please let me know if you find any issues in testing CNTK. As for CIFAR vs. ImageNet, I think having both would be beneficial to measure the speed of computation and I/O separately. CIFAR-10 is a small data set, but one can still build complex networks on it, like ResNet-110 in CNTK. That would be a very good indicator of how the platform performs when computation is intensive. ImageNet would put more pressure on I/O compared to CIFAR-10.

ke1337 avatar Feb 13 '17 22:02 ke1337
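
One way to act on this suggestion is to time training steps twice: once with batches already resident in memory (pure computation) and once reading from storage (computation plus I/O). The sketch below is framework-agnostic; get_synthetic_batch, load_batch_from_disk, and run_step are hypothetical stand-ins for whatever the benchmarked tool provides.

```python
import time

def mean_step_time(num_steps, get_batch, run_step):
    """Average wall-clock seconds per training step."""
    start = time.time()
    for _ in range(num_steps):
        run_step(get_batch())
    return (time.time() - start) / num_steps

# compute_only = mean_step_time(100, get_synthetic_batch, run_step)
# end_to_end   = mean_step_time(100, load_batch_from_disk, run_step)
# io_overhead  = end_to_end - compute_only  # seconds of I/O pressure per step
```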

Maybe it's worth looking at something in between ImageNet and CIFAR, like the Pascal VOC dataset?

cepera-ang avatar Apr 13 '17 05:04 cepera-ang

I rewrote all of the TensorFlow examples with the exception of the RNN. I think this can be closed once the PRs are accepted. I suspect our ResNet is still off, as there should not be that large a gap between any of the platforms on one or even multiple GPUs, especially a K80. They should all be within about 5-10%, maybe 20% in some weird cases, but in general, and as tested by NVIDIA, the top frameworks are nearly identical with CNNs (yes, some are faster and some slower, but not dramatically so). RNNs might be a different story, but if everyone is using cuDNN, it should again be similar and not dramatically different.

tfboyd avatar Jun 01 '17 15:06 tfboyd
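
To make the "within 5-10%, maybe 20%" expectation concrete, a small post-processing check over measured throughputs can flag platforms that fall outside that band. The numbers below are placeholders, not dlbench results.

```python
# Flag frameworks whose samples/sec trail the fastest by more than the
# ~20% worst-case band described above.
throughputs = {"toolA": 2000.0, "toolB": 1900.0, "toolC": 1400.0}  # placeholders
best = max(throughputs.values())
for name, rate in sorted(throughputs.items()):
    gap = (best - rate) / best * 100.0
    flag = "  <-- larger gap than expected" if gap > 20.0 else ""
    print("%s: %.0f samples/sec (%.1f%% behind best)%s" % (name, rate, gap, flag))
```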