YOSHIFUJI Naoki

38 comments by YOSHIFUJI Naoki

Fact: OpenCL (runtime header cl.h) has its own **version** (1.2, 2.0, 2.1, etc.). Therefore, to develop OpenCL software safely, we should use the header whose version matches the SDK's version. Am I wrong?...

What about the case of CuPy? CuPy also stores an ndarray in a CArray.

#163 reported that `clWaitForEvents` was one of the problems.

This article (in Japanese) could help us: https://qiita.com/shu65/items/42914bd2ad01d1e323da

@vorj @ybsh We need an NVIDIA GPU because we need to find out why ClPy is slower than CuPy.

If we have no way to profile on NVIDIA GPUs, you will have to use other profiling tools for Python and Cython themselves.
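
As one possible starting point for profiling on the Python side, here is a minimal sketch using the standard library's `cProfile` and `pstats`. The `workload` function and the `profile_call` helper are hypothetical names made up for illustration; replace `workload` with the ClPy or CuPy code you actually want to measure.

```python
import cProfile
import io
import pstats


def workload(n):
    # Hypothetical stand-in for the ClPy/CuPy code under investigation.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the 10 entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_call(workload, 100_000)
    print(report)
```

Note that functions compiled by Cython normally do not appear in a `cProfile` report unless the module is built with the `profile=True` compiler directive, so the Cython parts may need a rebuild with that directive before they show up here.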

I'm not soooo good at OpenCL profiling itself. Why don't you just try it?

@ybsh I don't know why, but I don't think you need to know the reason to solve this issue. Do you?

@ybsh Now you have confirmed that ClPy is slower than CuPy, even with configs/settings different from those in the last report. So please keep working with your configs/settings!