runtime JIT: Visit switch successors in increasing likelihood order for RPO-based layout

Follow-up to #101473. Also cleans up the BasicBlock successor visitor API surface a bit by separating logic for visiting successors in increasing likelihood order into BasicBlock::VisitAllSuccsInLikelihoodOrder.

May 06 '24 19:05 amanasifkhalid

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

May 06 '24 19:05 dotnet-policy-service[bot]

Here's the diff summary on win-x64, since the RPO-based layout is disabled by default. Diffs aren't that big, and they aren't overwhelmingly in one direction...

Diffs are based on 2,534,676 contexts (987,849 MinOpts, 1,546,827 FullOpts).

MISSED contexts: 2,922 (0.12%)

Overall (-8,298 bytes)

Collection	Base size (bytes)	Diff size (bytes)	PerfScore in Diffs
aspnet.run.windows.x64.checked.mch	39,855,926	+1,073	+0.20%
benchmarks.run.windows.x64.checked.mch	8,710,150	-108	-0.05%
benchmarks.run_pgo.windows.x64.checked.mch	32,332,197	-718	+0.39%
benchmarks.run_tiered.windows.x64.checked.mch	12,340,101	-23	+0.03%
coreclr_tests.run.windows.x64.checked.mch	402,921,293	-325	+0.27%
libraries.crossgen2.windows.x64.checked.mch	45,252,213	-1,591	-0.13%
libraries.pmi.windows.x64.checked.mch	63,617,634	-575	-0.11%
libraries_tests.run.windows.x64.Release.mch	285,052,618	-2,971	+0.79%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch	137,049,028	-574	-0.14%
realworld.run.windows.x64.checked.mch	13,552,182	-1,625	-0.08%
smoke_tests.nativeaot.windows.x64.checked.mch	5,032,760	-861	-1.60%

FullOpts (-8,298 bytes)

Collection	Base size (bytes)	Diff size (bytes)	PerfScore in Diffs
aspnet.run.windows.x64.checked.mch	27,307,698	+1,073	+0.20%
benchmarks.run.windows.x64.checked.mch	8,709,728	-108	-0.05%
benchmarks.run_pgo.windows.x64.checked.mch	18,519,682	-718	+0.39%
benchmarks.run_tiered.windows.x64.checked.mch	3,077,579	-23	+0.03%
coreclr_tests.run.windows.x64.checked.mch	121,562,176	-325	+0.27%
libraries.crossgen2.windows.x64.checked.mch	45,250,508	-1,591	-0.13%
libraries.pmi.windows.x64.checked.mch	63,504,133	-575	-0.11%
libraries_tests.run.windows.x64.Release.mch	107,104,520	-2,971	+0.79%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch	126,736,473	-574	-0.14%
realworld.run.windows.x64.checked.mch	13,146,461	-1,625	-0.08%
smoke_tests.nativeaot.windows.x64.checked.mch	5,031,713	-861	-1.60%

May 06 '24 19:05 amanasifkhalid

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. If you don't think it's worth churning the RPO layout implementation while running an experiment in the perf lab (especially since the diffs don't look like particularly large wins), I'm happy to undo those changes and just keep the block visitor cleanup work.

May 07 '24 17:05 amanasifkhalid

@jakobbotsch this cleans up the BasicBlock visitor API surface a bit. I initially tried something like the initializer pattern you mentioned over in #101473, but the current usage of AllSuccessorEnumerator in fgRunDfs complicated this approach -- what do you think of just passing the useProfile template argument in fgRunDfs to AllSuccessorEnumerator? It introduces some code duplication to the latter's constructor, but it's otherwise a pretty localized change.

May 08 '24 22:05 amanasifkhalid

runtime runtime copied to clipboard

JIT: Visit switch successors in increasing likelihood order for RPO-based layout

runtime
runtime copied to clipboard