cudarray
Any way to use MKL speedup in numpy?
Hi all,
I've tried accelerating CUDArray computations with Anaconda's MKL packages, but I didn't see any speedup at all (Python still used only one core). Is there a way to make use of the acceleration?
Thanks!
Daniel
For fully-connected architectures, you should make sure that MKL parallelizes the matrix multiplications across multiple cores. If only one core is busy, MKL may not be configured correctly. For convolutional architectures you are out of luck: the convolution operations in CUDArray are pretty lousy. I suspect a substantial speedup could be obtained by expressing them as matrix multiplications, as Caffe does, but I don't have time to implement it at the moment.
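One rough way to check whether NumPy is linked against MKL and whether matrix multiplications actually use several cores is sketched below. This is only a diagnostic under assumptions: the thread count of 4 and the helper name `time_matmul` are made up for illustration, and the environment variables are set before NumPy is imported to be on the safe side.

```python
import os

# Assumption: 4 is a placeholder core count; adjust it to your machine.
# Set these before NumPy is imported so the BLAS library sees them.
os.environ.setdefault("MKL_NUM_THREADS", "4")
os.environ.setdefault("OMP_NUM_THREADS", "4")

import time

import numpy as np


def time_matmul(n=1000, repeats=3):
    """Return the best wall-clock time (seconds) for an n x n float32 product."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n)).astype(np.float32)
    b = rng.standard_normal((n, n)).astype(np.float32)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Prints the BLAS/LAPACK libraries NumPy was built against;
    # look for "mkl" in the output.
    np.show_config()
    print("best matmul time: %.4f s" % time_matmul())
```

Running the script once with `MKL_NUM_THREADS=1` and once with a higher value, while watching CPU usage, should show whether the multiplication scales across cores.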
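To illustrate the idea of computing convolutions through matrix multiplication, here is a minimal NumPy sketch of the "im2col" approach. It is not CUDArray's or Caffe's actual code: the function names `im2col` and `conv2d_matmul` are hypothetical, and it only handles a single image with stride 1 and no padding (and, as is common in neural network libraries, it computes cross-correlation without flipping the filters).

```python
import numpy as np


def im2col(x, kh, kw):
    """Unfold a (C, H, W) image into a (C*kh*kw, out_h*out_w) patch matrix."""
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # Each row holds one filter-tap position for every output pixel.
                cols[row] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols


def conv2d_matmul(x, filters):
    """Valid, stride-1 cross-correlation as a single matrix product.

    x: (C, H, W), filters: (F, C, kh, kw); returns (F, out_h, out_w).
    """
    f, c, kh, kw = filters.shape
    out_h, out_w = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    cols = im2col(x, kh, kw)
    # The heavy lifting is one matrix product, which BLAS can parallelize.
    out = filters.reshape(f, -1) @ cols
    return out.reshape(f, out_h, out_w)
```

The patch matrix costs extra memory, but the bulk of the work becomes one large matrix product, which is the kind of operation a multithreaded BLAS such as MKL handles well.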