
Does libdnn support Mali GPUs?

Open 295988101 opened this issue 9 years ago • 4 comments

I use caffe-opencl with a Mali GPU, but it seems that libdnn does not support Mali. What I actually want is to optimize the OpenCL kernels for some operations, such as element-wise multiplication. You have done some memory optimization in libdnn's OpenCL kernels. As far as I know, though, OpenCL memory on Mali just uses CL_MEM_ALLOC_HOST_PTR .. for CPU data. Could you tell me which method libdnn uses for memory optimization, or point me to some resources about this?

thank you

295988101 avatar Sep 21 '16 09:09 295988101

@zhenghuitian I suggest you read these two excellent articles first:

  • https://github.com/clMathLibraries/clBLAS/wiki/AutoGemm
  • http://www.cedricnugteren.nl/tutorial.php

Other than that, you mainly have to find out the required FLOPS per global memory read/write to fully occupy the chip, as well as memory reading/writing strides for the individual threads.

Now, GEMM and convolution are quite difficult to get exactly right; for element-wise operations it's much easier.
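To make the "FLOPS per global memory read/write" idea concrete, here is a rough back-of-the-envelope sketch in Python. It uses a simple roofline-style estimate, which is an assumption on my part and not something taken from LibDNN. The device figures (`PEAK_GFLOPS`, `BANDWIDTH_GB_S`) are hypothetical placeholders; substitute the numbers for your own chip. The sketch covers only the FLOPs-per-byte side, not memory strides.

```python
def arithmetic_intensity(flops_per_item, bytes_per_item):
    """FLOPs performed per byte moved to or from global memory."""
    return flops_per_item / bytes_per_item


def attainable_gflops(intensity, peak_gflops, bandwidth_gb_s):
    """Roofline-style bound: limited by compute or by memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gb_s)


def gemm_intensity(n, bytes_per_elem=4):
    """Idealized N x N GEMM: 2*N^3 FLOPs, each of the 3 matrices moved once."""
    return (2 * n**3) / (3 * n**2 * bytes_per_elem)


# Hypothetical device figures, chosen only for illustration.
PEAK_GFLOPS = 100.0
BANDWIDTH_GB_S = 10.0

# FLOPs per byte a kernel needs before the ALUs, not memory, become the limit.
balance = PEAK_GFLOPS / BANDWIDTH_GB_S

# Element-wise multiply c[i] = a[i] * b[i] on float32:
# 1 FLOP per item, 2 reads + 1 write of 4 bytes each = 12 bytes per item.
eltwise = arithmetic_intensity(1, 3 * 4)

print("balance point (FLOPs/byte):", balance)
print("element-wise intensity:", eltwise)
print("element-wise bound (GFLOPS):",
      attainable_gflops(eltwise, PEAK_GFLOPS, BANDWIDTH_GB_S))
print("GEMM n=1024 bound (GFLOPS):",
      attainable_gflops(gemm_intensity(1024), PEAK_GFLOPS, BANDWIDTH_GB_S))
```

Under these assumed numbers the element-wise kernel sits far below the balance point, so it is bound by memory bandwidth and the only real lever is moving fewer bytes or moving them more efficiently. An idealized GEMM sits well above the balance point, but only if the kernel actually achieves that data reuse.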

LibDNN is mainly developed for desktop-class GPUs (AMD RX480, W9100 and NVIDIA GTX980, 1080) at the moment. The problem with mobile chips and Intel chips (for which we have separate spatial kernels in Caffe) is the reduced memory bandwidth, little or no local memory, and the other tricks you have to apply.

naibaf7 avatar Sep 21 '16 10:09 naibaf7

@naibaf7 Thank you. I have read those two articles, but their main content is not aimed at Mali GPUs. I have also read other articles about Mali and, following those, without local memory, I use vector types, half precision, vload and other methods to make it faster, and it helps.

295988101 avatar Sep 24 '16 07:09 295988101

@naibaf7 But I do not understand what you mean by "you mainly have to find out the required FLOPS per global memory read/write to fully occupy the chip, as well as memory reading/writing strides for the individual threads." How do I do that?

295988101 avatar Sep 24 '16 07:09 295988101

I've opened a specific issue at https://github.com/naibaf7/libdnn/issues/18

bhack avatar Dec 03 '16 10:12 bhack