runtime icon indicating copy to clipboard operation
runtime copied to clipboard

JIT: Visit switch successors in increasing likelihood order for RPO-based layout

Open amanasifkhalid opened this issue 9 months ago • 4 comments

Follow-up to #101473. Also cleans up the BasicBlock successor visitor API surface a bit by separating logic for visiting successors in increasing likelihood order into BasicBlock::VisitAllSuccsInLikelihoodOrder.

amanasifkhalid avatar May 06 '24 19:05 amanasifkhalid

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

Here's the diff summary on win-x64, since the RPO-based layout is disabled by default. Diffs aren't that big, and they aren't overwhelmingly in one direction...

Diffs are based on 2,534,676 contexts (987,849 MinOpts, 1,546,827 FullOpts).

MISSED contexts: 2,922 (0.12%)

Overall (-8,298 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
aspnet.run.windows.x64.checked.mch 39,855,926 +1,073 +0.20%
benchmarks.run.windows.x64.checked.mch 8,710,150 -108 -0.05%
benchmarks.run_pgo.windows.x64.checked.mch 32,332,197 -718 +0.39%
benchmarks.run_tiered.windows.x64.checked.mch 12,340,101 -23 +0.03%
coreclr_tests.run.windows.x64.checked.mch 402,921,293 -325 +0.27%
libraries.crossgen2.windows.x64.checked.mch 45,252,213 -1,591 -0.13%
libraries.pmi.windows.x64.checked.mch 63,617,634 -575 -0.11%
libraries_tests.run.windows.x64.Release.mch 285,052,618 -2,971 +0.79%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 137,049,028 -574 -0.14%
realworld.run.windows.x64.checked.mch 13,552,182 -1,625 -0.08%
smoke_tests.nativeaot.windows.x64.checked.mch 5,032,760 -861 -1.60%
FullOpts (-8,298 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
aspnet.run.windows.x64.checked.mch 27,307,698 +1,073 +0.20%
benchmarks.run.windows.x64.checked.mch 8,709,728 -108 -0.05%
benchmarks.run_pgo.windows.x64.checked.mch 18,519,682 -718 +0.39%
benchmarks.run_tiered.windows.x64.checked.mch 3,077,579 -23 +0.03%
coreclr_tests.run.windows.x64.checked.mch 121,562,176 -325 +0.27%
libraries.crossgen2.windows.x64.checked.mch 45,250,508 -1,591 -0.13%
libraries.pmi.windows.x64.checked.mch 63,504,133 -575 -0.11%
libraries_tests.run.windows.x64.Release.mch 107,104,520 -2,971 +0.79%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 126,736,473 -574 -0.14%
realworld.run.windows.x64.checked.mch 13,146,461 -1,625 -0.08%
smoke_tests.nativeaot.windows.x64.checked.mch 5,031,713 -861 -1.60%

amanasifkhalid avatar May 06 '24 19:05 amanasifkhalid

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. If you don't think it's worth churning the RPO layout implementation while running an experiment in the perf lab (especially since the diffs don't look like particularly large wins), I'm happy to undo those changes and just keep the block visitor cleanup work.

amanasifkhalid avatar May 07 '24 17:05 amanasifkhalid

@jakobbotsch this cleans up the BasicBlock visitor API surface a bit. I initially tried something like the initializer pattern you mentioned over in #101473, but the current usage of AllSuccessorEnumerator in fgRunDfs complicated this approach -- what do you think of just passing the useProfile template argument in fgRunDfs to AllSuccessorEnumerator? It introduces some code duplication to the latter's constructor, but it's otherwise a pretty localized change.

amanasifkhalid avatar May 08 '24 22:05 amanasifkhalid