py4dgeo icon indicating copy to clipboard operation
py4dgeo copied to clipboard

Parallelization of KDTree construction

Open dokempf opened this issue 3 years ago • 1 comments

Performance benchmarking shows that as is, the KDTree construction is the bottleneck of the entire algorithm. A naive approach to parallelization could be to distribute the subtree construction to threads. The very coarse levels do not parallelize well in this algorithm, but a sufficiently deep tree mitigates this disadvantage.

Implementation includes two non-trivial aspects:

  • Review nanoflann's internal data structures w.r.t. thread safety as we are operating outside of nanoflann's communicated thread safetly guarantee. However, we are only using the static version of nanoflann (the dynamic would definitely not be thread-safe).
  • Add a task-based OpenMP parallelization to the recursive construction algorithm.

dokempf avatar Nov 30 '21 14:11 dokempf

Nanoflann v1.5.0 added parallel construction support: https://github.com/jlblancoc/nanoflann/releases/tag/v1.5.0

However, my initial testing was far off the advocated speedup of 3. I got around 20% for medium core counts and the code got slower for large core counts. I am currently hesitant to include that in the code base, at least not without a user interface to control.

dokempf avatar Jun 22 '23 07:06 dokempf