FFTW.NET
Best of both worlds?
Hi Argus
I love your work on the arrays and their memory alignment. Yet, if I'm not wrong, they are not perfect yet:
- AlignedArray and FftwArray are fast in the unmanaged world but slow in the managed world, because their indexers run several checks (VerifyNotDisposed, VerifyRank) and need heavy logic to read an element.
- PinnedArray is not aligned, so the fftw lib won't run optimally on it.
Could we have the best of both worlds with, say, a PinnedArray2 that exposes real arrays (to the managed world) but is also memory aligned?
As far as I know there is no way to allocate a managed array which is guaranteed to be memory aligned (there certainly wasn't at the time of writing this code).
What is your scenario? If getting/setting elements on the managed side of AlignedArray really is the bottleneck, copying the buffer to a true array (with Buffer.BlockCopy or the like) might be your best bet.
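A minimal sketch of that copy approach, for illustration only. Buffer.BlockCopy works between managed arrays, so for an unmanaged pointer this sketch uses Marshal.Copy instead. The class NativeCopyExample and its methods are hypothetical names, and Marshal.AllocHGlobal merely stands in for whatever aligned native buffer the library hands out.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of FFTW.NET.
public static class NativeCopyExample
{
    // Copies `count` doubles from an unmanaged buffer into a new managed array.
    public static double[] ToManaged(IntPtr source, int count)
    {
        var result = new double[count];
        Marshal.Copy(source, result, 0, count);
        return result;
    }

    // Copies a managed array into an unmanaged buffer of at least the same size.
    public static void ToNative(double[] source, IntPtr destination)
    {
        Marshal.Copy(source, 0, destination, source.Length);
    }

    // Demonstrates a full round trip through unmanaged memory.
    public static double[] RoundTrip(double[] input)
    {
        // Assumption: AllocHGlobal stands in for the library's native buffer.
        IntPtr buffer = Marshal.AllocHGlobal(input.Length * sizeof(double));
        try
        {
            ToNative(input, buffer);
            return ToManaged(buffer, input.Length);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The idea is to do all element-wise work on the plain double[] and pay for two bulk copies per transform, rather than going through the indexer for every element.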
Of course, if you do find a better solution, I'd be happy to accept a PR.
What do you think of these two articles? It seems there is actually memory alignment in .NET in some cases, including byte arrays:
- https://stackoverflow.com/questions/9741395/alignment-of-arrays-in-net
- https://stackoverflow.com/questions/15501766/are-arrays-in-net-naturally-aligned
Yes, there is, but only to specific boundaries. As far as I know there is no way to get 16-byte alignment, which would be required for fftw.
Hi, thank you for mentioning 16-byte alignment. I was thinking of the wrong number (8, of course); 16 is confirmed on http://www.fftw.org/fftw3_doc/SIMD-alignment-and-fftw_005fmalloc.html . Yet I was wondering: is this 16-byte requirement linked to the use of double (or complex, i.e. two doubles)? If we were using fftw3f, i.e. working on floats, would we still need 16-byte alignment, or 8? I hope my question does not sound meaningless... Maybe I should also ask it on the fftw forum.
And indeed you were right about .NET arrays, simply created with new double[xx], not always being 16-byte aligned. My tests showed that half of the time they are not 16-byte aligned, but they are always 8-byte aligned.
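For anyone who wants to repeat this kind of test, here is one way it could be done. This is a sketch, not the code actually used for the test above: AlignmentCheck, IsAligned and Count are hypothetical names. It pins each array with GCHandle so the address is stable while it is inspected.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of FFTW.NET.
public static class AlignmentCheck
{
    // True if `address` is a multiple of `alignment` (in bytes).
    public static bool IsAligned(IntPtr address, int alignment)
        => address.ToInt64() % alignment == 0;

    // Allocates `trials` arrays of `length` doubles and counts how many
    // have their first element on an 8-byte and on a 16-byte boundary.
    public static (int aligned8, int aligned16) Count(int trials, int length)
    {
        int aligned8 = 0, aligned16 = 0;
        for (int i = 0; i < trials; i++)
        {
            var array = new double[length];
            // Pin the array so the GC cannot move it while we read its address.
            GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
            try
            {
                IntPtr address = handle.AddrOfPinnedObject();
                if (IsAligned(address, 8)) aligned8++;
                if (IsAligned(address, 16)) aligned16++;
            }
            finally
            {
                handle.Free();
            }
        }
        return (aligned8, aligned16);
    }
}
```

Calling AlignmentCheck.Count(1000, 1024) returns both counters. The exact split depends on the runtime, the bitness and the allocation pattern, so the numbers may differ from one machine to another.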
I think the requirement for 16-byte alignment comes from the use of SIMD in fftw, which I think is independent of whether we use floats or doubles (I'm not sure, though).
Out of curiosity: have you measured the performance degradation of accessing elements in AlignedArray vs. native .NET arrays (with AlignedArrayDouble/AlignedArrayComplex; the generic version should be slower because it uses Marshal.PtrToStructure, etc.)? I guess I would expect performance comparable to that of List<double>.
Good idea to measure the perf impact. On 10^8 iterations (on Win x64, Ryzen 1700) for float (yes, I changed your lib):
- Write: array: 193, aligned array: 4542 (drops to 3576 when removing the checks)
- Read: array: 113, aligned array: 4079 (drops to 3331 when removing the checks)
(Write takes a little longer because my loop adds an extra instruction to keep the compiler from stripping useless code.) I brought these figures down to 675 and 367 by removing the inheritance from AlignedArray<T> (and so any virtual calls) and by using unsafe code instead. Same when using Complex.
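To make the last point concrete, a sealed, non-generic buffer with an unsafe indexer might look roughly like the following. This is a sketch under assumptions, not the modified library code: AlignedFloatBuffer is a hypothetical type, it gets its 16-byte alignment by over-allocating with Marshal.AllocHGlobal rather than through fftw's own allocator, and it needs unsafe code enabled in the project (AllowUnsafeBlocks).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type, not part of FFTW.NET.
// Sealed and non-generic, so the indexer involves no virtual calls.
public sealed unsafe class AlignedFloatBuffer : IDisposable
{
    private const int Alignment = 16;

    private IntPtr _raw;            // pointer returned by AllocHGlobal, needed for freeing
    private readonly float* _data;  // first 16-byte aligned address inside the allocation

    public int Length { get; }

    public AlignedFloatBuffer(int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
        // Over-allocate by `Alignment` bytes so an aligned start always fits.
        _raw = Marshal.AllocHGlobal(new IntPtr((long)length * sizeof(float) + Alignment));
        // Round the raw address up to the next multiple of 16.
        long aligned = (_raw.ToInt64() + Alignment - 1) & ~(long)(Alignment - 1);
        _data = (float*)aligned;
    }

    // Aligned address that could be handed to native code.
    public IntPtr Pointer => new IntPtr(_data);

    // No bounds, rank or disposal checks: the caller is responsible for
    // staying inside [0, Length) and not using the buffer after Dispose.
    public float this[int index]
    {
        get => _data[index];
        set => _data[index] = value;
    }

    public void Dispose()
    {
        if (_raw != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_raw);
            _raw = IntPtr.Zero;
        }
    }
}
```

The trade-off is the obvious one: dropping the checks is what makes the indexer cheap, but an out-of-range index now corrupts memory instead of throwing. The sketch also has no finalizer, so a buffer that is never disposed leaks its native memory.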
On release builds?
Sure! And compiled as x64