
OpenCL support

Open pietern opened this issue 8 years ago • 13 comments

Master issue to track OpenCL support.

@danzimm -- if you end up issuing some PRs, please mention this issue. Thanks :100: :+1:

pietern avatar May 23 '17 15:05 pietern

cc @bwasti

Yangqing avatar May 24 '17 15:05 Yangqing

Would the goal be to support AMD, or to support all GPUs (including Intel and ARM), or even also FPGAs and DSPs?

VincentSC avatar Jun 01 '17 08:06 VincentSC

@pietern Would you like any help and contributions?

We have many years of experience with optimising applications for server, mobile and embedded OpenCL accelerators (especially for the market-dominant ARM Mali and Qualcomm Adreno GPUs), as well as tuning closed- and open-source compute libraries for Caffe1 and other DNN frameworks (e.g. see this public Jupyter Notebook).

Most importantly, we have unique expertise on how to achieve OpenCL performance portability (no mean feat!) across diverse operating environments (Android, Linux, Windows), device architectures (CPUs, GPUs, DSPs, custom accelerators), data inputs (sizes, shapes, patterns), etc.

psyhtest avatar Jun 04 '17 12:06 psyhtest

Hi @Yangqing ,

Hugh Perkins has created Coriander, which can run NVIDIA® CUDA™ code on OpenCL 1.2 devices. You might want to take a look to see if it suits your needs. Kindly attribute him and his contribution if you plan to use his work.

viper7882 avatar Jun 20 '17 15:06 viper7882

@Yangqing @bwasti @pietern Guys, are you open to contributions, or should the OpenCL community be content with contributing to Caffe1?

psyhtest avatar Jul 11 '17 08:07 psyhtest

@psyhtest Any idea for Caffe1 OpenCL on Mali or Adreno GPUs? We run Caffe1 with OpenCL on Mali or Adreno GPUs in an Android environment, and we find that the OpenCL kernels take a very long time to finish.

Running an FCN net, the ARM*8 CPU takes only 5 s, but the Mali T8 GPU with OpenCL takes about 25 s for one iteration.

haolongzhangm avatar Oct 26 '17 02:10 haolongzhangm

Is it planned to support OpenCL 1.1? Or only 2.0 and above?

I heard 2.0 adds many features that CUDA had but earlier OpenCL versions lacked.
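For reference, one concrete example of such a feature is shared virtual memory (SVM), which plays roughly the same role as CUDA's unified memory and only exists from OpenCL 2.0 onwards. The sketch below is purely illustrative and not Caffe2 code; it assumes a single OpenCL 2.0-capable GPU and omits error handling.

```cpp
// Hypothetical illustration of an OpenCL 2.0-only feature: shared virtual
// memory (SVM), roughly the counterpart of CUDA unified memory.
// Error checking is omitted; a single GPU platform/device is assumed.
#include <CL/cl.h>
#include <cstdio>

int main() {
  cl_platform_id platform;
  cl_device_id device;
  clGetPlatformIDs(1, &platform, nullptr);
  clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

  cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
  cl_command_queue queue =
      clCreateCommandQueueWithProperties(ctx, device, nullptr, nullptr);

  // clSVMAlloc exists only from OpenCL 2.0 onwards: the same pointer is
  // usable on the host and (via clSetKernelArgSVMPointer) inside kernels.
  const size_t n = 1024;
  float* data = static_cast<float*>(
      clSVMAlloc(ctx, CL_MEM_READ_WRITE, n * sizeof(float), 0));

  // Coarse-grained SVM still needs a map/unmap pair around host access.
  clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, data, n * sizeof(float),
                  0, nullptr, nullptr);
  for (size_t i = 0; i < n; ++i) data[i] = 1.0f;
  clEnqueueSVMUnmap(queue, data, 0, nullptr, nullptr);

  // A kernel would take this pointer directly:
  //   clSetKernelArgSVMPointer(kernel, 0, data);
  // which has no equivalent in OpenCL 1.1/1.2.

  clSVMFree(ctx, data);
  clReleaseCommandQueue(queue);
  clReleaseContext(ctx);
  return 0;
}
```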

quartzsaber avatar Nov 06 '17 16:11 quartzsaber

@haolongzhangm Apologies, I've only just read your message here.

Which OpenCL math library do you use with Caffe? ViennaCL and clBLAS are not optimised for mobile. CLBlast can be tuned with very good results.

Also, are you using FCN-16 by any chance? I found this network to be a real killer for mobile GPUs, taking seconds for a single convolution layer even with adequately optimised code.
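To make the CLBlast route above concrete, here is a minimal, hypothetical sketch of calling its GEMM routine from C++ (not Caffe/Caffe2 code). Matrix sizes and contents are made up, and device-specific tuning would be run separately with CLBlast's bundled tuners before deployment.

```cpp
// Hypothetical sketch of an SGEMM call via CLBlast's C++ API.
// Buffer contents and sizes are illustrative; error handling is minimal.
#include <CL/cl.h>
#include <clblast.h>
#include <vector>
#include <cstdio>

int main() {
  cl_platform_id platform;
  cl_device_id device;
  clGetPlatformIDs(1, &platform, nullptr);
  clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);
  cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
  cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, nullptr);

  // C = alpha * A * B + beta * C with A (m x k), B (k x n), C (m x n).
  const size_t m = 64, n = 64, k = 64;
  std::vector<float> a(m * k, 1.0f), b(k * n, 1.0f), c(m * n, 0.0f);

  cl_mem a_buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                a.size() * sizeof(float), a.data(), nullptr);
  cl_mem b_buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                b.size() * sizeof(float), b.data(), nullptr);
  cl_mem c_buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                c.size() * sizeof(float), c.data(), nullptr);

  // CLBlast selects kernel parameters tuned for the current device, which is
  // what makes it a better fit for Mali/Adreno than untuned libraries.
  auto status = clblast::Gemm<float>(
      clblast::Layout::kRowMajor, clblast::Transpose::kNo, clblast::Transpose::kNo,
      m, n, k, 1.0f,
      a_buf, 0, k, b_buf, 0, n, 0.0f, c_buf, 0, n,
      &queue, nullptr);
  clFinish(queue);
  printf("CLBlast status: %d\n", static_cast<int>(status));

  clReleaseMemObject(a_buf); clReleaseMemObject(b_buf); clReleaseMemObject(c_buf);
  clReleaseCommandQueue(queue);
  clReleaseContext(ctx);
  return 0;
}
```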

psyhtest avatar Jan 06 '18 00:01 psyhtest

@Yangqing Any plans to support OpenCL in Caffe2?

psyhtest avatar Jan 06 '18 00:01 psyhtest

There is work being done by ROCm (https://rocm.github.io/index.html) on Caffe2 at https://github.com/ROCmSoftwarePlatform/caffe2 for OpenCL. Feel free to take a look at that as well.

orionr avatar Jan 10 '18 18:01 orionr

@orionr ROCm is not OpenCL. This will not work on any devices other than those supported by AMDGPU-PRO. It's based on HIP, AMD's drop-in replacement for CUDA.
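To illustrate that last point, here is a tiny, hypothetical host-side sketch (not taken from the ROCm Caffe2 port): the HIP runtime API mirrors CUDA's almost call for call, which is why it serves as a porting path for CUDA code rather than as an OpenCL backend.

```cpp
// Hypothetical host-side sketch showing why HIP is called a "drop-in"
// replacement for CUDA: the runtime API mirrors CUDA's almost 1:1
// (hipMalloc ~ cudaMalloc, hipMemcpy ~ cudaMemcpy, and so on).
// This targets AMD's ROCm/HIP stack, not OpenCL.
#include <hip/hip_runtime.h>
#include <vector>
#include <cstdio>

int main() {
  const size_t n = 1024;
  std::vector<float> host(n, 1.0f);

  float* dev = nullptr;
  hipMalloc(reinterpret_cast<void**>(&dev), n * sizeof(float));  // cf. cudaMalloc
  hipMemcpy(dev, host.data(), n * sizeof(float),
            hipMemcpyHostToDevice);                              // cf. cudaMemcpy
  hipMemcpy(host.data(), dev, n * sizeof(float),
            hipMemcpyDeviceToHost);
  hipDeviceSynchronize();                                        // cf. cudaDeviceSynchronize
  hipFree(dev);                                                  // cf. cudaFree

  printf("round-tripped value: %f\n", host[0]);
  return 0;
}
```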

naibaf7 avatar Jan 20 '18 18:01 naibaf7

Are contributions for this support still welcome? I'm interested in OpenCL support in caffe2.

nuka137 avatar Mar 25 '18 06:03 nuka137

For deep learning inference on mobile devices with GPU/OpenCL support, you can check out MACE, which supports Adreno, Mali and PowerVR GPUs. Here are some benchmark results.

llhe avatar Jul 17 '18 07:07 llhe