Soumith Chintala
propagate_down is indeed true! [This is the benchmark code, and the place where it is set to true.](https://github.com/BVLC/caffe/blob/b1c4f121b4f01b538eef6997ba3af6c9a71afd31/tools/net_speed_benchmark.cpp#L90)
yes, it does that here: https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu#L79
if we want a double confirmation, we can ask @rbgirshick who is watching this repo. Ross, what do you say?
> Isn't SpatialConvolutionMM supposed to be using the same tricks anyhow?

Yes, SpatialConvolutionMM is borrowed from Caffe (thanks guys), but we are using the cublas v1 interface (whereas Caffe uses...
@f0k that does seem to be a good point! The utility I use is Caffe's own, and looking through [layer.hpp](https://github.com/BVLC/caffe/blob/master/include/caffe/layer.hpp), I don't see an explicit synchronize either. I'll have to...
> but maybe the other benchmarks should do the same to ensure only the GPU time is measured, or the caffe benchmark should time "from the outside" just like the...
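The concern in the quote above can be sketched generically: on an asynchronous backend, a kernel launch returns before the GPU has done any work, so timing "from the outside" only measures GPU time if the clock reads are bracketed by explicit synchronization. A minimal Python sketch, where `launch_kernel` and `synchronize` are hypothetical stand-ins for a framework's async launch and device sync (not Caffe's actual API):

```python
import time

def time_forward(launch_kernel, synchronize, iterations=10):
    """Time `iterations` async launches, synchronizing before each clock read.

    Without the final synchronize(), the timer would only capture the cost
    of *enqueueing* the kernels, not of running them on the device.
    """
    synchronize()              # drain any pending work before starting
    start = time.time()
    for _ in range(iterations):
        launch_kernel()        # returns immediately on an async backend
    synchronize()              # wait for all launched work to finish
    return (time.time() - start) / iterations
```

This is the same pattern regardless of framework: sync, start clock, launch, sync, stop clock.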
I benchmarked Theano, Torch, ccn2, and Caffe with CUDA 6.5. Alex also updated ccn2 with some more performance improvements, and I fixed the Theano benchmark to average over 10 iterations (like...
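Averaging over several iterations, with warm-up runs excluded, can be sketched as follows; `run_net` is a hypothetical callable standing in for one forward/backward pass of whichever framework is being benchmarked:

```python
import time

def benchmark(run_net, iterations=10, warmup=1):
    """Average wall-clock time of `run_net` over `iterations` runs,
    after `warmup` untimed runs to exclude one-time costs
    (allocation, compilation, autotuning)."""
    for _ in range(warmup):
        run_net()
    start = time.time()
    for _ in range(iterations):
        run_net()
    return (time.time() - start) / iterations
```

Excluding the first run matters especially for Theano, where the first call can include compilation overhead that would skew a single-shot measurement.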
right, fixed.
Thanks to @nouiz, who fixed an issue with the Theano FFT gradInput; the Theano FFT implementation is now 1.5x faster than Caffe (which is in 2nd place). Now is the time to also...
@stencilman it depends on how transparent you want it to be (multi-GPU can already be done in Torch if you are careful with a few things), but this is not...