Slowness seen using PiecewiseAffineTransform compared to scikit-image version

Open JHancox opened this issue 1 year ago • 2 comments

Describe the bug

`cucim.skimage.transform.PiecewiseAffineTransform` seems to be several times slower than the scikit-image equivalent.

Steps/Code to reproduce bug

When running the code below, I observe an 8x slowdown for the estimate and a 2x slowdown for the warp operations, using the PyTorch 24.01 container with cucim 23.12.

Expected behavior

The code should execute at least as fast as the CPU version.

Environment details

Docker on Ubuntu 22.04, PyTorch 24.01 container, with scikit-image and cucim 23.12 installed via pip.

Additional context

```python
import matplotlib.pyplot as plt
from skimage.transform import PiecewiseAffineTransform, warp
from scipy.interpolate import LinearNDInterpolator
import numpy as np
from timeit import default_timer as timer
from cucim.skimage.transform import PiecewiseAffineTransform as cu_PAT
from cucim.skimage.transform import warp as cu_warp
import cupy as cp

# create some offsets and coordinates
vectors = np.array([[3.0,1.0],[-5.,-1.3],[-3.5,8.3],[0,0],[0,0],[0,0], [0,0]])
coords = np.array([[20,20],[180,50],[20, 180],[0,0],[0,255],[255,0], [255,255]])

# Create grid (step_size is the number of samples per axis)
step_size = 20
x = np.linspace(0, 255, num=step_size)
y = np.linspace(0, 255, num=step_size)
X, Y = np.meshgrid(x, y)

# interpolate the x and y offsets over the grid
interpx = LinearNDInterpolator(list(coords), vectors[:,0])
Zxi = interpx(Y, X)

interpy = LinearNDInterpolator(list(coords), vectors[:,1])
Zyi = interpy(Y, X)

# create an array of coords
src = np.column_stack((X.reshape(-1), Y.reshape(-1)))

# add the interpolated offsets
dst_rows = X + Zxi
dst_cols = Y + Zyi

dst = np.column_stack([dst_cols.reshape(-1), dst_rows.reshape(-1)])

# NOTE: `imgrid` was not defined in the original report. A synthetic
# 256x256 grid image is assumed here so that the script runs end to end.
imgrid = np.zeros((256, 256), dtype=np.float32)
imgrid[::16, :] = 1.0
imgrid[:, ::16] = 1.0

# compute transforms
tform = PiecewiseAffineTransform()

start = timer()
tform.estimate(src, dst)
print("cpu estimate took {}s".format(timer()-start))

start = timer()
out = warp(imgrid, tform, output_shape=(255, 255))
print("cpu warp took {}s".format(timer()-start))

# repeat using cupy/cucim.skimage
cu_tform = cu_PAT()
start = timer()
cu_tform.estimate(cp.array(src), cp.array(dst))
print("gpu estimate took {}s".format(timer()-start))

start = timer()
out = cu_warp(cp.array(imgrid), cu_tform, output_shape=(255, 255))
print("gpu warp took {}s".format(timer()-start))
```
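A caveat on how the timings above are taken: the GPU measurements include the host-to-device copies (`cp.array(...)`) inside the timed region, each call is timed only once, and CuPy launches work asynchronously, so a single unsynchronized measurement may not reflect steady-state cost. The following is a minimal sketch of a timing helper that adds a warm-up run and an optional synchronization step. The name `bench` and its parameters are hypothetical, introduced here for illustration only, and the demonstration at the bottom is CPU-only so that it runs without a GPU.

```python
from timeit import default_timer as timer


def bench(fn, sync=None, warmup=1, repeats=5):
    """Return the best wall-clock time (seconds) of `fn` over `repeats` runs.

    `sync` is an optional zero-argument callable that blocks until pending
    asynchronous work has finished (hypothetical hook; for CuPy this could
    be a stream synchronize).
    """
    for _ in range(warmup):
        fn()  # warm-up: absorbs one-time costs such as kernel compilation
    if sync is not None:
        sync()  # make sure warm-up work is finished before timing starts
    best = float("inf")
    for _ in range(repeats):
        start = timer()
        fn()
        if sync is not None:
            sync()  # wait for queued work before stopping the clock
        best = min(best, timer() - start)
    return best


if __name__ == "__main__":
    # CPU-only demonstration so the sketch runs without a GPU.
    elapsed = bench(lambda: sum(range(100_000)))
    print("sum took {:.6f}s".format(elapsed))
```

For the GPU path, one option would be to copy the inputs first (for example `src_gpu = cp.array(src)` and `dst_gpu = cp.array(dst)`, both illustrative names) and then call `bench(lambda: cu_tform.estimate(src_gpu, dst_gpu), sync=cp.cuda.Stream.null.synchronize)`. This separates transfer and warm-up cost from the transform itself. It does not by itself explain the reported slowdown.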

JHancox, Feb 08 '24 10:02