euclidean-distance-transform-3d
Cache Aware Single Core Version
In testing, my test volume spent about 69% of its run time bound by memory latency. That suggests a cache-aware version could be ~2-3x faster.
It's not clear this is achievable, though. You might just end up exchanging latency on the core data for latency in the range and vertex calculations.
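As a rough sanity check on the ~2-3x figure, here is an Amdahl-style bound. It assumes the 69% is the fraction of run time spent stalled on memory, and that a fraction s of those stalls (a hypothetical parameter, not something I measured) could be removed without adding new work elsewhere:

```latex
S(s) = \frac{1}{1 - 0.69\,s}, \qquad
S(1) = \frac{1}{0.31} \approx 3.2, \qquad
S(0.75) = \frac{1}{0.4825} \approx 2.1
```

So 2-3x would require removing roughly 75% or more of the stalls, and ~3.2x is the ceiling under these assumptions.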
It might be possible to use __builtin_prefetch (available in g++ and clang) for the Z axis.
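A minimal sketch of what the prefetch idea could look like, assuming the volume is stored with x varying fastest so that neighbors along Z are sx * sy voxels apart. The helper name gather_z_column and the lookahead parameter are hypothetical and not part of this library. The sketch copies one Z column into a contiguous buffer while prefetching a few voxels ahead:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of this library): copy the Z column at
// (x, y) out of an x-fastest volume into a contiguous buffer, issuing
// software prefetches for voxels further down the column.
template <typename T>
void gather_z_column(
    const T* volume,             // sx * sy * sz voxels, x varies fastest
    std::size_t sx, std::size_t sy, std::size_t sz,
    std::size_t x, std::size_t y,
    T* column,                   // output buffer of length sz
    std::size_t lookahead = 8    // assumed prefetch distance in voxels; tune by measurement
) {
    assert(x < sx && y < sy);

    const std::size_t stride = sx * sy;   // distance between Z neighbors
    const T* src = volume + x + sx * y;   // address of voxel (x, y, 0)

    for (std::size_t z = 0; z < sz; z++) {
#if defined(__GNUC__) || defined(__clang__)
        if (z + lookahead < sz) {
            // rw = 0 (read), locality = 0 (no reuse expected after the copy)
            __builtin_prefetch(src + (z + lookahead) * stride, 0, 0);
        }
#endif
        column[z] = src[z * stride];
    }
}
```

The 1D pass could then run on the contiguous column and the result be scattered back with the same stride. Whether this beats the hardware prefetcher would need to be measured. The extra copy may cancel out the gain.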