Andrew Lavin

Results 61 comments of Andrew Lavin

@jbalma yes, I find strong scaling of the training problem very interesting, and of course it is an active area of research. I hope you find a good forum...

Hi @nomi-wei , just a clarification: our fast convnet algorithms use [Winograd's _convolution_ algorithms](https://www.encyclopediaofmath.org/index.php/Winograd_small_convolution_algorithm). But the same Shmuel Winograd did co-author the [Coppersmith-Winograd fast matrix multiplication algorithm](https://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm), so the confusion...
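For readers who haven't seen Winograd's convolution algorithms, here is a minimal numpy sketch of the 1D minimal filtering algorithm F(2,3), using the standard transform matrices from the fast-convnet literature. It computes 2 outputs of a 3-tap filter with 4 elementwise multiplies instead of the 6 that direct computation needs; this is illustrative only, not the optimized implementation discussed in these comments.

```python
import numpy as np

# Standard F(2,3) transform matrices: data transform B^T,
# filter transform G, and output transform A^T.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 outputs,
    using only 4 multiplies (the elementwise product)."""
    return AT @ ((G @ g) * (BT @ d))

# Check against direct computation of the 3-tap filter:
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 1.0, 1.0])
direct = np.array([d[0:3] @ g, d[1:4] @ g])
assert np.allclose(winograd_f23(d, g), direct)
```

The 2D algorithms in the paper nest this construction, applying the transforms along both rows and columns of 4x4 input tiles.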

"With these optimizations time to train AlexNet\* network on full ILSVRC-2012 dataset to 80% top5 accuracy reduces from 58 days to about 5 days." The benchmark used dual E5-2699-v3 CPUs,...

So anyway the numbers Intel reported sound plausible, but your numbers don't. :-)

.. and having looked a bit at Caffe's CPU implementation, I see that im2col is single-threaded, which will be a pretty nasty bottleneck on a 36-core system.
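For anyone unfamiliar with the transform: im2col lowers convolution to a matrix multiply by copying every filter-sized window of the input into a column of a scratch matrix. A single-channel numpy sketch of the idea (not Caffe's actual implementation, which also handles channels, stride, padding, and groups):

```python
import numpy as np

def im2col(x, k):
    """Lower an (H, W) image to a (k*k, n_windows) matrix, one column
    per filter-sized window. Convolution then becomes a matmul."""
    H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[i:i + k, j:j + k].ravel()
    return cols

x = np.arange(16.0).reshape(4, 4)
w = np.ones((3, 3))
# Convolution (cross-correlation) as a single matrix multiply:
y = (w.ravel() @ im2col(x, 3)).reshape(2, 2)
assert y.tolist() == [[45.0, 54.0], [81.0, 90.0]]
```

The copy loop is pure memory traffic with no arithmetic to hide it behind, so if it runs on one core while the subsequent GEMM runs on 36, it dominates the runtime.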

@moskewcz your numbers sound plausible to me.. and so Intel's post really points to what a disaster out-of-the-box Caffe performance must be on many-core CPUs.

@ozabluda Ah, I did not know about this feature of Xeon processors, thanks. So it is Xeon only? soumith's Core(TM) i7-5930K will not have this? My i7-5775C seems to sustain...

I tracked down AVX base frequency specs for Haswell E5 processors here: https://www.microway.com/knowledge-center-articles/detailed-specifications-intel-xeon-e5-2600v3-haswell-ep-processors/. It would be nice to find an official Intel source. I suspect this is only a feature of...

@soumith What command line did you use? README.txt says:

```
For timing
#> ./build/tools/caffe time \
       -iterations \
       --model=models/intel_alexnet/train_val.prototxt
```

When I run that on my 4-core i7-5775C I get:...