cufinufft Implementing 1.25 upsampling factor with precomputed Horner kernel

Hi all,

Thank you very much for everyone's work on this library. What would be required to implement the upsampling ratio sigma = 1.25 in cufinufft, in a similar way to the CPU FINUFFT version? From what I understand, FINUFFT has precomputed Horner kernel coefficients for both 1.25 and 2.0, whereas cufinufft only has them for 2.0.

At first glance, src/cuspreadinterp.h would need to change, but I don't know whether any further modifications would be necessary.

Mar 23 '22 16:03 AaronGhost

Hi Aaron, You are right. Are you running out of RAM for the FFTs? (That would be a good reason to implement this!) It would be very useful if you could try this out in a draft PR. I think it should be enough to update the code you mention, plus github.com/flatironinstitute/cufinufft/blob/master/contrib/spreadinterp.cpp, to match the functions in FINUFFT's https://github.com/flatironinstitute/finufft/blob/master/src/spreadinterp.cpp, and maybe to add a flag/switch in the tester routines. I don't think it is too hard.

I hope @MelodyShih will also chime in about what would need to be changed. She is finishing her PhD, so she is quite busy.

She might remember if there's some reason we didn't do this. (Maybe since the spreading kernels are larger for upsampfac=1.25 it doesn't help much on the GPU side?)
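To make the kernel-width remark concrete, here is a minimal sketch of how the width ns depends on the upsampling factor. It assumes the width rule recalled from FINUFFT's setup_spreader around the time of this thread: ns = ceil(-log10(eps/10)) for upsampfac = 2.0, and ns = ceil(-ln(eps) / (pi*sqrt(1 - 1/upsampfac))) otherwise. The function name kernel_width is hypothetical, and the rule should be checked against the current source before relying on it.

```python
import math


def kernel_width(eps, upsampfac):
    """Spreading kernel width ns (in grid points) for tolerance eps.

    Assumed rule, recalled from FINUFFT's setup_spreader; not taken
    from cufinufft itself. Verify against the source you are using.
    """
    if upsampfac == 2.0:
        ns = math.ceil(-math.log10(eps / 10.0))
    else:
        ns = math.ceil(
            -math.log(eps) / (math.pi * math.sqrt(1.0 - 1.0 / upsampfac))
        )
    return max(2, ns)  # a kernel narrower than 2 points is not useful


if __name__ == "__main__":
    for sigma in (2.0, 1.25):
        print(f"upsampfac={sigma}: ns={kernel_width(6e-8, sigma)}")
```

Under this assumed rule, eps = 6e-8 gives a wider kernel at 1.25 than at 2.0, so each nonuniform point touches more grid points (ns^3 of them in 3D). The trade-off is a smaller FFT grid against more spreading work per point.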

I may be able to help if you get stuck. Best, Alex

Mar 23 '22 21:03 ahbarnett

Hi Alex,

I have been experimenting a bit with FINUFFT and CUFINUFFT on an MRI reconstruction, and changing the upsampling factor to 1.25 leads to faster reconstruction, so I am wondering if it would have the same effect with CUFINUFFT. I don't know if there are further modifications to make here: https://github.com/flatironinstitute/finufft/blob/master/src/spreadinterp.cpp

I am not quite sure I understand how the testing routines work, so I don't know where the flag would belong.

Mar 24 '22 18:03 AaronGhost

Hi, I think the changes you made should be enough. Other places to look at are the checks here https://github.com/flatironinstitute/cufinufft/blob/master/src/2d/spread2d_wrapper.cu#L661 and https://github.com/flatironinstitute/cufinufft/blob/master/src/3d/spread3d_wrapper.cu#L1181. We might need to find a bin size that works for all possible tolerances (or restrict the cases in which the smaller upsampling factor can be used?). E.g., for single precision, if my calculation is correct, the largest ns is 12 (eps = 6e-8). Then, for 3D problems, the check will fail with the current default bin size (16,16,2): (16+12)x(16+12)x(2+12)x2x4 = 87808 > 49152. We will be fine for 1D and 2D problems: (16+12)x(16+12)x8 = 6272 < 49152.
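The arithmetic above can be reproduced with a short script. This is only a sketch of the estimate as written in this post (each bin dimension padded by ns, times 2 reals per complex value, times 4 bytes per single-precision real, compared against 49152 bytes); the helper names are hypothetical, and the padding used in the actual cufinufft check may differ.

```python
from math import prod

SHARED_MEM_LIMIT = 49152  # bytes, the limit quoted in the post


def shared_mem_bytes(bin_size, ns, real_bytes=4):
    """Bytes for one padded bin of complex values.

    Follows the estimate in the post: prod(b + ns) * 2 * real_bytes,
    where real_bytes is 4 for single precision and 8 for double.
    """
    return prod(b + ns for b in bin_size) * 2 * real_bytes


def fits_in_shared_mem(bin_size, ns, real_bytes=4):
    """True if the padded bin fits within SHARED_MEM_LIMIT."""
    return shared_mem_bytes(bin_size, ns, real_bytes) <= SHARED_MEM_LIMIT


if __name__ == "__main__":
    for bins in ((16, 16, 2), (16, 16)):
        used = shared_mem_bytes(bins, ns=12)
        print(bins, used, "fits" if fits_in_shared_mem(bins, 12) else "too large")
```

The same helper can be used to scan candidate 3D bin sizes for a given ns before settling on a default.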

Mar 24 '22 18:03 MelodyShih

This was fixed by https://github.com/flatironinstitute/finufft/pull/488. Closing.

Aug 06 '24 20:08 janden
