dask-image

Unable to wrap function with da.as_gufunc for CuPy array

Open lrlunin opened this issue 2 years ago • 7 comments

Describe the issue: I have been happily using dask-image to process many images on the CPU. Now I want to find out whether this task can be accelerated with a GPU. Previously I defined my custom function this way:

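import dask.array as da
import numpy as np
from scipy.signal import fftconvolve

@da.as_gufunc(signature="(i,j),(),()->(i,j)", output_dtypes=np.int16, vectorize=True)
def cpu_convolve_peak(img, th, dis):
    mask_hor_np = np.ones([1, 2])  # 1x2 horizontal kernel
    # NumPy has no fftconvolve; scipy.signal.fftconvolve is assumed here
    clusters = fftconvolve(img, mask_hor_np, mode='same')
    # doing some further stuff with clusters as local peak search etc
    return clusters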

I applied it to images loaded with dask_image.imread.imread("...", arraytype="numpy"), and it worked perfectly.

Now I am loading my images with cp_images = dask_image.imread.imread("...", arraytype="cupy") and have a Python function that does some fairly complex processing using GPU-only functions.

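import cupy as cp
import dask.array as da
import cupyx.scipy.signal as cuscipy  # assumed alias; the import is not shown in the original

@da.as_gufunc(signature="(i,j),(),()->(i,j)", output_dtypes=cp.int16, vectorize=True)
def gpu_convolve_peak(img, th, dis):
    mask_hor_cp = cp.ones([1, 2])  # 1x2 horizontal kernel on the GPU
    clusters = cuscipy.fftconvolve(img, mask_hor_cp, mode='same')
    # doing some further stuff with clusters as local peak search etc
    return clusters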

When I now try to execute the following code:

gpu_convolve_peak(cp_images, 100, 2).compute()

or apply further operations at the end, for example:

gpu_convolve_peak(cp_images, 100, 2).sum(axis=0).compute()

I get an error: TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
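One way an error like this can arise is sketched below. This is a minimal illustration, not a confirmed diagnosis of the traceback. With vectorize=True, da.as_gufunc applies np.vectorize to the function, and np.vectorize with a signature converts each argument via np.asanyarray before calling it. The DeviceArray class is a hypothetical stand-in for a GPU array that refuses implicit conversion, and identity and try_call are illustrative helpers.

```python
import numpy as np


class DeviceArray:
    """Hypothetical stand-in for a GPU array that refuses implicit NumPy conversion."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, *args, **kwargs):
        # Mimics the refusal quoted in the error message above.
        raise TypeError(
            "Implicit conversion to a NumPy array is not allowed. "
            "Please use `.get()` to construct a NumPy array explicitly."
        )

    def get(self):
        # In a real GPU library this is an explicit device-to-host copy;
        # here it simply returns a copy of the wrapped NumPy data.
        return self._data.copy()


def identity(img, th, dis):
    # Placeholder for the real per-image processing.
    return img


# np.vectorize with a signature calls np.asanyarray on every argument
# before it invokes the wrapped function.
wrapped = np.vectorize(identity, signature="(i,j),(),()->(i,j)")


def try_call(arr):
    """Return 'ok' on success, or the TypeError message on failure."""
    try:
        wrapped(arr, 100, 2)
    except TypeError as exc:
        return str(exc)
    return "ok"


implicit_result = try_call(DeviceArray(np.zeros((3, 4))))        # conversion refused
explicit_result = try_call(DeviceArray(np.zeros((3, 4))).get())  # plain NumPy input
print(implicit_result)
print(explicit_result)
```

If this is indeed the mechanism, the conversion happens in the np.vectorize wrapper rather than in the function body, so the GPU-only calls inside the function would never be reached.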

It seems that Dask internally tries to handle the array as a NumPy array even though it is a CuPy array. I would also like to ask whether my approach is correct with respect to the CUDA architecture. When I execute a CuPy function, it runs only in a "single-core mode" on the GPU, right? I mean that the GPU doesn't parallelize the work by itself, and I need to run more instances to benefit from all the 2000+ CUDA cores on my GPU. I would be very thankful if you could correct me and tell me what the right way would be.

Thank you so much for your beautiful project and nice code!

Environment:

  • Dask-image version: 2022.9.0
  • Cupy version: 11.3
  • Python version: 3.8.15
  • Operating System: Ubuntu 20.04.4 LTS x86_64
  • Install method (conda, pip, source): conda
  • CUDA version: 11.5
  • GPU: Nvidia GTX 1070Ti

lrlunin, Nov 28 '22 21:11