NFFT.jl
copying of CUDA Plan does not work
Sorry for spamming :laughing:
But calling copy on a CUDA plan does not work:
p = NFFT.plan_nfft(coords, (size(x,1), size(x,2)))
MethodError: no method matching copy(::CuNFFT.CuNFFTPlan{Float32, 2})
Closest candidates are:
copy(!Matched::LinearAlgebra.Hessenberg{<:Any, <:LinearAlgebra.UpperHessenberg}) at ~/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/share/julia/stdlib/v1.8/LinearAlgebra/src/hessenberg.jl:419
copy(!Matched::LinearAlgebra.Hessenberg{<:Any, <:LinearAlgebra.SymTridiagonal}) at ~/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/share/julia/stdlib/v1.8/LinearAlgebra/src/hessenberg.jl:420
copy(!Matched::LinearAlgebra.Cholesky) at ~/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:511
...
(::Main.var"workspace#17".var"#1#2"{CuNFFT.CuNFFTPlan{Float32, 2}})(::Int64)@none:0
[email protected]:47[inlined]
[email protected]:787[inlined]
f_cuda@Other: 3[inlined]
top-level scope@Local: 1[inlined]
While implementing this should technically be straightforward, I don't think it will work as you hope. Different CuNFFTPlan instances will not execute in parallel on the GPU, so a threaded for loop will not speed anything up.
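For reference, here is a rough, CPU-only sketch of what a copy method for a plan-like object and the "one private copy per threaded iteration" pattern could look like. ToyPlan, its fields, and apply! are hypothetical stand-ins made up for illustration; they are not the actual CuNFFTPlan layout or the NFFT.jl API, and the sketch says nothing about how the work would be scheduled on a GPU.

```julia
# Hypothetical stand-in for a plan type; the real CuNFFTPlan has different fields.
struct ToyPlan{T}
    nodes::Matrix{T}      # sampling nodes, treated as read-only after construction
    scratch::Vector{T}    # work buffer that apply! overwrites on every call
end

ToyPlan(nodes::Matrix{T}) where {T} = ToyPlan{T}(nodes, zeros(T, size(nodes, 2)))

# Copy the arrays as well, so that two plans never share a work buffer.
Base.copy(p::ToyPlan{T}) where {T} = ToyPlan{T}(copy(p.nodes), copy(p.scratch))

# Toy "transform": fills the scratch buffer, then reduces it to a single number.
function apply!(p::ToyPlan, scale)
    for j in eachindex(p.scratch)
        p.scratch[j] = scale * sum(@view p.nodes[:, j])
    end
    return sum(p.scratch)
end

p = ToyPlan(rand(Float32, 2, 16))
results = zeros(Float32, 8)

# Each iteration works on its own copy, so threads do not race on the scratch buffer.
Threads.@threads for i in 1:8
    q = copy(p)
    results[i] = apply!(q, Float32(i))
end
```

On the CPU this avoids data races between threads. Per the comment above, the same pattern would not be expected to give a speedup with GPU plans, since they would not run concurrently.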
Yeah, I wanted to give it a shot with KernelAbstractions. Surprisingly, that worked pretty well with Interpolations, hence I wanted to give it a try here.
But yes, looks like it is not working.
So at the moment my plan is not quite working, is it?
What is your general impression of the CUDA performance? 10x faster for big arrays?