pyvkfft
pyvkfft copied to clipboard
pyvkfft.fft.fftn error?
Hi! I have found that pyvkfft.fft.fftn exhibits strange behavior. Also, in previous versions of pyvkfft, this code raises an error.
OS:Windows 10 Python: 3.12.2
import numpy as np
import cupy as cp
import pyvkfft.fft
shape = (3,2,2)
size = np.prod(shape)
x1 = cp.arange(size,dtype=cp.complex64).reshape(shape)
x2 = x1[:,None]
y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2))
y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).squeeze()
print("shape x1:",x1.shape, "shape x2:", x2.shape)
print("shape y1:",y1.shape, "shape y2:", y2.shape)
print(y1-y2)
case (i) pyvkfft == 2024.1.1 vkfft == 1.3.4
shape x1: (3, 2, 2) shape x2: (3, 1, 2, 2)
shape y1: (3, 2, 2) shape y2: (3, 2, 2)
[[[ -9.+0.j 1.+0.j ]
[ 2.-3.4641016j 0.+0.j ]]
[[ 28.+3.4641016j -2.+0.j ]
[-55.+0.j 3.+0.j ]]
[[ 44.-3.4641016j -2.+0.j ]
[ 2.+3.4641016j 0.+0.j ]]]
case (ii) pyvkfft == 2023.1.1 vkfft == 1.2.9
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
Cell In[2], line 8
5 x2 = x1[:,None]
7 y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2))
----> 8 y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).squeeze()
10 print("shape x1:",x1.shape, "shape x2:", x2.shape)
11 print("shape y1:",y1.shape, "shape y2:", y2.shape)
File [~\miniconda3\Lib\site-packages\pyvkfft\fft.py:200](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/fft.py#line=199), in fftn(src, dest, ndim, norm, axes, cuda_stream, cl_queue, return_scale)
167 """
168 Perform a FFT on a GPU array, automatically creating the VkFFTApp
169 and caching it for future re-use.
(...)
197 :return: the destination array if return_scale is False, or (dest, scale)
198 """
199 backend, inplace, dest, cl_queue = _prepare_transform(src, dest, cl_queue, False)
--> 200 app = _get_fft_app(backend, src.shape, src.dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue,
201 strides=src.strides)
202 if backend == Backend.PYOPENCL:
203 app.fft(src, dest, queue=cl_queue)
File [~\miniconda3\Lib\site-packages\pyvkfft\fft.py:137](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/fft.py#line=136), in _get_fft_app(backend, shape, dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue, strides)
134 @lru_cache(maxsize=config.FFT_CACHE_NB)
135 def _get_fft_app(backend, shape, dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue, strides=None):
136 if backend in [Backend.PYCUDA, Backend.CUPY]:
--> 137 return VkFFTApp_cuda(shape, dtype, ndim=ndim, inplace=inplace,
138 stream=cuda_stream, norm=norm, axes=axes, strides=strides)
139 elif backend == Backend.PYOPENCL:
140 return VkFFTApp_cl(shape, dtype, cl_queue, ndim=ndim, inplace=inplace,
141 norm=norm, axes=axes, strides=strides)
File [~\miniconda3\Lib\site-packages\pyvkfft\cuda.py:123](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/cuda.py#line=122), in VkFFTApp.__init__(self, shape, dtype, ndim, inplace, stream, norm, r2c, dct, axes, strides, **kwargs)
121 raise RuntimeError("Error creating VkFFTConfiguration. Was the CUDA context properly initialised ?")
122 res = ctypes.c_int(0)
--> 123 self.app = _vkfft_cuda.init_app(self.config, ctypes.byref(res))
124 check_vkfft_result(res, shape, dtype, ndim, inplace, norm, r2c, dct, axes, "cuda")
125 if self.app is None:
OSError: exception: integer divide by zero
Thanks for the report, I can confirm this behaviour also with OpenCL:
import numpy as np
import pyopencl as cl
import pyopencl.array as cla
import pyvkfft.fft
from pyvkfft.base import calc_transform_axes
ctx = cl.create_some_context()
clq = cl.CommandQueue(ctx)
shape = (3,2,2)
size = np.prod(shape)
x0 = np.arange(size,dtype=np.complex64).reshape(shape)
x1 = cla.to_device(clq, x0)
x2 = cla.to_device(clq, x0[:,None])
y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2)).get()
y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).get() #.squeeze()
y0 = np.fft.fftn(x0, axes=(-1,-2))
print("shape x1:",x1.shape, "shape x2:", x2.shape)
print("shape y1:",y1.shape, "shape y2:", y2.shape)
print(abs(y1-y0).sum())
print(abs(y2.squeeze()-y0).sum())
which gives:
shape x1: (3, 2, 2) shape x2: (3, 1, 2, 2)
shape y1: (3, 2, 2) shape y2: (3, 1, 2, 2)
0.0
152.34962482055641
Interestingly the initial re-shuffling of the internal array shape passed to VkFFT (collapsing non-transformed arrays) gives the same results:
from pyvkfft.base import calc_transform_axes
print(calc_transform_axes((3,2,2), axes=(-1,-2)))
print(calc_transform_axes((3,1,2,2), axes=(-1,-2)))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))
So something else is going wrong...
Hmm, I think I understand what is happening - for an axis of size 1 the stride is zero:
print(x1.strides)
print(x2.strides)
(72, 24, 8)
(72, 0, 24, 8)
and pyvkfft will then assume the internal transform order must be changed, and this is where the error occurs:
print(calc_transform_axes((3,2,2), axes=(-1,-2), strides=x1.strides))
print(calc_transform_axes((3,1,2,2), axes=(-1,-2), strides=x2.strides))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))
([2, 1, 3, 1], 2, [False, True, False, True], 3, [-3, -1])
now I need to understand what is going wrong - probably the easiest solution is to ignore any 1-sized (0-strided) axis
Some background info about why an axis of size 1 has a stride=0: see https://github.com/numpy/numpy/issues/22950
OK, this should be fixed in the current git master.
The main issue was not, in fact, strides equal to zero (that's easily handled). And it is not possible to just ignore (squeeze) axes of length 1, as R2C transforms will give different results due to the handling of the fast axis... Which is surprisingly messy to handle, e.g. when considering arrays where only one axis has a length>1.