Oleg Zabluda

47 comments by Oleg Zabluda

This somewhat explains how Intel's multi-node Caffe works: https://github.com/BVLC/caffe/pull/3252

> @scott-gray> Yes this is F(2x2,3x3). [...] I'm able to fit this all in one block for K=32 and 4 overlapping coordinates of x,y each with 8 units of minibatch....

> I like to set the power limit at 275

But Titan-X doesn't allow setting power limits (275W is the default). The only way I found to meaningfully benchmark Titan-X is...

BTW, see commit https://github.com/soumith/convnet-benchmarks/commit/d6177f97e61da0d98a528f355086eb2fc05fe7b8 for how Soumith does warmup for both Nervana and cuDNN, and its effect

> There seem to be power and speed optimizations depending on special cases of zero.

Very interesting. This must be in hardware. I wonder if the hardware automatically uses less...

FWIW, the following shows a 7% data-dependent (all-zero vs. all-one) difference in power consumption for integer matrix multiplication on AVR. Titan-X is likely to show the same effect, IMO, even without...

> One thing I want to point out is that I'll have a completely new set of kernels out soonish, and these do a much better job of keeping data...

> It's clear the more things are toggling on the chip the more power it draws. It's also possible that additional power is saved when an all zero condition is...

In a workstation, it's a standard thing you can find in the gamers' forums and magazines, for example, see http://www.maximumpc.com/a-beginners-guide-to-liquid-cooling/. In a standard rack-mounted server, there is no room for...

@scott-gray: Awesome, as always! What do you consider a "fully fused kernel", compared to a "partially fused kernel", and what are its real advantages? Guarantee that the data is available in...