Naoki Shibata

Results 223 comments of Naoki Shibata

The values returned by SLEEF functions may differ slightly within the specified error range, even if only a vectorized path is taken.
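If results can legitimately differ within the error bound, comparing them bit for bit will report false mismatches. A minimal sketch of a tolerance-based comparison for `float` values is shown below. The helper names `ordered_bits`, `ulp_distance` and `close_in_ulps` are hypothetical and not part of SLEEF, and the tolerance is left for the caller to choose.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of SLEEF): map a float's bit pattern to an
   integer that increases monotonically with the float's value. */
int64_t ordered_bits(float x) {
    int32_t i;
    memcpy(&i, &x, sizeof i);          /* well-defined type punning */
    /* Negative floats have the sign bit set; mirror them below zero. */
    return i < 0 ? (int64_t)INT32_MIN - i : (int64_t)i;
}

/* Hypothetical helper: distance between a and b, counted in representable
   float steps (units in the last place). NaN inputs are rejected. */
int64_t ulp_distance(float a, float b) {
    assert(a == a && b == b);          /* x == x is false only for NaN */
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

/* Returns 1 if a and b agree to within max_ulps steps, 0 otherwise. */
int close_in_ulps(float a, float b, int64_t max_ulps) {
    return ulp_distance(a, b) <= max_ulps;
}
```

Note that two results which each lie within a given bound of the true value can differ from each other by up to twice that bound, so the tolerance passed to `close_in_ulps` should probably be chosen with that in mind.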

Thank you! Actually, this is the first time I have received a message regarding the DFT library. 😀 I will look into it.

One thing that is not documented is that you need to allocate (n/2+1)*sizeof(real)*2 bytes of memory when doing a real transform. I am sorry for this.
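As a rough illustration of that size formula, the sketch below computes the byte count for a transform of length n. The names `real`, `real_dft_buffer_bytes` and `alloc_real_dft_buffer` are hypothetical, `real` is assumed to be `double` here, and plain `malloc` is only a stand-in: the library may need aligned memory, in which case its own allocator should be used instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: double precision. Use float for a single-precision transform. */
typedef double real;

/* Hypothetical helper: bytes needed for a real transform of length n,
   i.e. room for n/2+1 complex values of two reals each. */
size_t real_dft_buffer_bytes(size_t n) {
    assert(n > 0);
    return (n / 2 + 1) * sizeof(real) * 2;
}

/* Hypothetical helper: allocate the buffer with plain malloc (a stand-in
   for whatever aligned allocator the library expects). Caller frees it. */
real *alloc_real_dft_buffer(size_t n) {
    return malloc(real_dft_buffer_bytes(n));
}
```

For n = 1024 this reserves space for 513 complex values, which is slightly more than the n reals of the input.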

As for the difference in data placement, what FFT library are you comparing to? SleefDFT is designed to be easy to migrate from FFTW and Ooura FFT.

There is a tester, src/dft-tester/fftwtest1d.c, that compares the results of transforms between SleefDFT and FFTW. I can check the correctness of your code if you could compare the...

Hello, those scalar functions are not optimized. They are provided to make it easy to understand how the vectorized versions of the functions work. You can try Sleef_sinf1_u35purecfma instead of Sleef_sinf_u35. Those purecfma...

That means that recent math functions in glibc are pretty fast. SLEEF is a vectorized math library, and it is not meant for scalar computation.

Could you tell me a little bit about the purpose of your code?

Actually, I have a plan for this issue, which is to remove sleefdp.c and sleefsp.c and make the scalar functions aliases of the functions that use the purecscalar helper.

I am going to introduce a dispatcher for those functions so that they can utilize FMA if it is available. The scalar functions will then be as fast as the vector functions.
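To make the idea of a dispatcher concrete, here is a minimal sketch of runtime dispatch through a function pointer. It is not SLEEF's implementation: every name in it (`kernel_generic`, `kernel_fma`, `cpu_has_fma`, `dispatched_kernel`) is hypothetical, both kernels are placeholders that evaluate the same small polynomial, and the CPU check relies on the GCC/Clang x86 builtins, falling back to the generic path elsewhere.

```c
#include <assert.h>
#include <stddef.h>

typedef float (*kernel_fn)(float);

/* Placeholder kernel: evaluates 1 + x + x*x/2 with ordinary arithmetic. */
float kernel_generic(float x) { return 1.0f + x * (1.0f + 0.5f * x); }

/* Placeholder for the FMA build. In a real library this function would be
   compiled with FMA enabled; here it repeats the same arithmetic. */
float kernel_fma(float x) { return 1.0f + x * (1.0f + 0.5f * x); }

/* Hypothetical feature check. Uses compiler builtins on x86 with GCC or
   Clang and reports "no FMA" on every other platform. */
int cpu_has_fma(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("fma") != 0;
#else
    return 0;
#endif
}

/* Resolved on the first call. This lazy initialization is not
   thread-safe; it is kept simple for illustration. */
static kernel_fn resolved = NULL;

/* Public entry point: picks a kernel once, then forwards every call. */
float dispatched_kernel(float x) {
    if (resolved == NULL) {
        resolved = cpu_has_fma() ? kernel_fma : kernel_generic;
    }
    assert(resolved != NULL);
    return resolved(x);
}
```

After the first call the only overhead is one indirect call, which is why a dispatched scalar function can get close to the speed of the kernel it forwards to.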