celeritas icon indicating copy to clipboard operation
celeritas copied to clipboard

Require a minimum number of volumes to build a BIH tree

Open elliottbiondo opened this issue 7 months ago • 1 comments

Analysis of the tilecal problem confirmed that the BIH is causing a 3x slowdown. Profiling revealed that this is due to increased branching (fewer active threads per warp) which leads to worse memory performance. The tilecal problem itself has a few universes with more than 100 volumes, but many more with fewer than 20. This MR adds a bih_min_size parameter that causes the BIH to only be built for universes with a sufficiently large number of volumes. If a universe has too few volumes, all volumes will be placed into "non-tree volumes," which was previously reserved for volumes with infinite bounding boxes. This results in the following performance on Wildstyle:

Commit Tilecal runtime on V100 (s)
before BIH was merged: 09cbf5ba999 49.5
after BIH was merged: acf500714c27 146
develop: a4c469f7d42f 146
this MR: c37d4d376c3 64.6

Likewise, this MR provides an 88% increase in active threads per warp w.r.t. develop, as well as the following improved memory behavior (also w.r.t. develop):

Screenshot 2025-05-08 at 6 07 53 PM

This MR is a draft for two reasons:

  1. bih_min_size is a tuning parameter. We need to come up with a better system for incorporating these in the code.
  2. It is not clear what to tune this parameter to. Obviously a single test on a V100 is not sufficient. Would the entire suite on an MI250X be sufficient? Can we define and overall FOM for the suite, in the event that performance gains for one problem are accompanied by performance losses in another?

Once we resolve these issues I will update this MR accordingly.

elliottbiondo avatar May 08 '25 22:05 elliottbiondo

Test summary

  466 files    727 suites   23s :stopwatch: 1 169 tests 1 150 :white_check_mark: 14 :zzz:  5 :x: 2 358 runs  2 340 :white_check_mark:  8 :zzz: 10 :x:

For more details on these failures, see this check.

Results for commit c37d4d37.

github-actions[bot] avatar May 08 '25 22:05 github-actions[bot]

Draft until this is refactor to instead set a maximum leaf size

elliottbiondo avatar Jun 03 '25 16:06 elliottbiondo

Closing this MR; the minimum leaf size approach used here is superior: #1930

Determining this parameter will done with benchmark problems on ANL JLSE per https://github.com/celeritas-project/regression/issues/3.

elliottbiondo avatar Sep 23 '25 19:09 elliottbiondo