finufft
Python: Can cufinufft automatically figure out `gpu_device_id`?
Originally reported downstream: https://github.com/flatironinstitute/pytorch-finufft/issues/103
The following will crash with either `Fatal Python error: Aborted` or `Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)`:
import numpy as np
import torch
import cufinufft

data = torch.view_as_complex(
    torch.stack((torch.randn(15, 80, 12000), torch.randn(15, 80, 12000)), dim=-1)
)
omega = torch.rand(2, 12000) * 2 * np.pi - np.pi

cufinufft.nufft2d1(
    *omega.to("cuda:1"),
    data.reshape(-1, 12000).to("cuda:1"),
    (320, 320),
    isign=-1,
)
If you change to `cuda:0` for both arrays it seems to work fine.
The full error I get is:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): exclusive_scan failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
Fatal Python error: Aborted
Current thread 0x000015555552c4c0 (most recent call first):
File "/mnt/home/bward/finufft/finufft/python/cufinufft/cufinufft/_plan.py", line 236 in setpts
File "/mnt/home/bward/finufft/finufft/python/cufinufft/cufinufft/_simple.py", line 38 in _invoke_plan
File "/mnt/home/bward/finufft/finufft/python/cufinufft/cufinufft/_simple.py", line 12 in nufft2d1
File "/mnt/home/bward/finufft/finufft/mwe.py", line 14 in <module>
Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special (total: 20)
Aborted (core dumped)
Does it also break if you specify the device id explicitly in the kwargs? e.g.
cufinufft.nufft2d1(
    *omega.to("cuda:1"),
    data.reshape(-1, 12000).to("cuda:1"),
    (320, 320),
    isign=-1,
    gpu_device_id=1,
)
@lu1and10 no, that seems to have fixed it (sorry for not chasing through enough `**kwarg` doc to find that option).
So this issue can be re-worded as a feature request: can `_compat.py` pick up a reasonable default for `gpu_device_id`?
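Until such a default exists, the explicit `gpu_device_id` workaround can be wrapped on the caller's side. The following is a minimal sketch, not part of cufinufft: `call_on_input_device` is a hypothetical helper name, and it assumes the inputs are PyTorch-style tensors whose `.device` exposes `.type` and `.index`.

```python
def call_on_input_device(fn, *args, **kwargs):
    """Call fn, filling in gpu_device_id from the first CUDA input.

    Hypothetical caller-side helper (not a cufinufft API). An explicit
    gpu_device_id passed by the caller is never overridden.
    """
    if "gpu_device_id" not in kwargs:
        for arg in args:
            dev = getattr(arg, "device", None)
            # torch.device exposes .type ("cuda"/"cpu") and .index (int or None)
            if getattr(dev, "type", None) == "cuda" and dev.index is not None:
                kwargs["gpu_device_id"] = dev.index
                break
    return fn(*args, **kwargs)


# Intended usage with the example from this issue (needs a second GPU):
# call_on_input_device(
#     cufinufft.nufft2d1,
#     *omega.to("cuda:1"),
#     data.reshape(-1, 12000).to("cuda:1"),
#     (320, 320),
#     isign=-1,
# )
```

Non-tensor arguments such as the `(320, 320)` shape tuple have no `.device` attribute and are simply skipped.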
Yes, I guess so. It would be a nice feature if the device could be inferred from the inputs.
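As a rough illustration of what inferring the device from the inputs might involve, here is a sketch under stated assumptions. `infer_gpu_device_id` is a hypothetical name, and this is not how cufinufft's `_compat.py` is implemented. It duck-types two conventions: PyTorch tensors carry `.device.type` and `.device.index`, and CuPy arrays carry `.device.id`. It also rejects inputs that live on different GPUs, since mixing devices is the kind of situation that led to the crash above.

```python
def infer_gpu_device_id(*arrays, default=0):
    """Return the CUDA device index shared by all GPU inputs.

    Hypothetical sketch, not cufinufft code. Inputs without a
    recognizable GPU device (CPU tensors, tuples, ints) are ignored;
    if no GPU input is found, `default` is returned.
    """
    ids = set()
    for arr in arrays:
        dev = getattr(arr, "device", None)
        if dev is None:
            continue
        if getattr(dev, "type", None) == "cuda":
            # PyTorch convention: torch.device(type="cuda", index=N)
            if dev.index is not None:
                ids.add(dev.index)
        elif hasattr(dev, "id"):
            # CuPy convention: cupy.cuda.Device has an integer .id
            ids.add(dev.id)
    if len(ids) > 1:
        raise ValueError(f"inputs live on different GPUs: {sorted(ids)}")
    return ids.pop() if ids else default
```

Raising on mismatched devices, rather than silently picking one, is a design choice in this sketch; a library could equally decide to copy inputs to a single device.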