rust Run a single huge par_body_owners instead of many small ones after each other.

This improves parallel rustc parallelism by avoiding the bottleneck after each individual par_body_owners (because it needs to wait for queries to finish, so if there is one long running one, a lot of cores will be idle while waiting for the single query).

Mar 07 '24 14:03 oli-obk

r? @davidtwco

rustbot has assigned @davidtwco. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Mar 07 '24 14:03 rustbot

@bors try @rust-timer queue

Mar 07 '24 14:03 oli-obk

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Mar 07 '24 14:03 rust-timer

:hourglass: Trying commit 164473b2ac96cd6dccd0a53bc7a0a99ea05004ce with merge c0e49b5e16f7b562786f83d7702b40f242942aee...

Mar 07 '24 14:03 bors

:sunny: Try build successful - checks-actions Build commit: c0e49b5e16f7b562786f83d7702b40f242942aee (c0e49b5e16f7b562786f83d7702b40f242942aee)

Mar 07 '24 16:03 bors

Queued c0e49b5e16f7b562786f83d7702b40f242942aee with parent 8c9a75b3238b66592779d6b240dbf78eacefebb8, future comparison URL. There is currently 1 preceding artifact in the queue. It will probably take at least ~1.3 hours until the benchmark run finishes.

Mar 07 '24 16:03 rust-timer

Finished benchmarking commit (c0e49b5e16f7b562786f83d7702b40f242942aee): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never @rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.3%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.8%, -0.2%]	5
Improvements ✅ (secondary)	-0.3%	[-0.5%, -0.2%]	12
All ❌✅ (primary)	-0.2%	[-0.8%, 0.3%]	8

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.4%	[2.4%, 2.4%]	1
Improvements ✅ (primary)	-1.9%	[-2.5%, -0.9%]	8
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.9%	[-2.5%, -0.9%]	8

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.3%	[1.1%, 14.8%]	65
Regressions ❌ (secondary)	12.3%	[4.8%, 18.6%]	18
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	4.3%	[1.1%, 14.8%]	65

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 648.501s -> 652.39s (0.60%) Artifact size: 172.69 MiB -> 172.63 MiB (-0.03%)

Mar 07 '24 17:03 rust-timer

@bors try @rust-timer queue

Mar 08 '24 18:03 oli-obk

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Mar 08 '24 18:03 rust-timer

:hourglass: Trying commit c5312694deba3c4c68ec0c0c59a95c41c0c327ed with merge 702bd6de43d33c2a791c1a2f6acccd68f55d3afd...

Mar 08 '24 18:03 bors

:sunny: Try build successful - checks-actions Build commit: 702bd6de43d33c2a791c1a2f6acccd68f55d3afd (702bd6de43d33c2a791c1a2f6acccd68f55d3afd)

Mar 08 '24 19:03 bors

Queued 702bd6de43d33c2a791c1a2f6acccd68f55d3afd with parent 74acabe9b042ea8c42862ee29aca2a8b7d333644, future comparison URL. There is currently 1 preceding artifact in the queue. It will probably take at least ~1.4 hours until the benchmark run finishes.

Mar 08 '24 19:03 rust-timer

Finished benchmarking commit (702bd6de43d33c2a791c1a2f6acccd68f55d3afd): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never @rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.3%]	2
Regressions ❌ (secondary)	1.7%	[1.7%, 1.7%]	1
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[-0.3%, 0.3%]	3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.1%	[1.1%, 1.1%]	1
Regressions ❌ (secondary)	5.7%	[5.1%, 6.2%]	2
Improvements ✅ (primary)	-1.7%	[-2.4%, -0.7%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.2%	[-2.4%, 1.1%]	6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.9%	[1.1%, 13.7%]	71
Regressions ❌ (secondary)	11.7%	[2.9%, 20.6%]	20
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.9%	[1.1%, 13.7%]	71

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 648.792s -> 653.885s (0.78%) Artifact size: 172.56 MiB -> 172.55 MiB (-0.01%)

Mar 08 '24 20:03 rust-timer

@bors r=davidtwco

Mar 11 '24 11:03 oli-obk

:pushpin: Commit 55ea94402b5c57a8c937a0efdd4b86e3521dcd7f has been approved by davidtwco

It is now in the queue for this repository.

Mar 11 '24 11:03 bors

:hourglass: Testing commit 55ea94402b5c57a8c937a0efdd4b86e3521dcd7f with merge 65cd843ae06ad00123c131a431ed5304e4cd577a...

Mar 11 '24 14:03 bors

:sunny: Test successful - checks-actions Approved by: davidtwco Pushing 65cd843ae06ad00123c131a431ed5304e4cd577a to master...

Mar 11 '24 16:03 bors

Finished benchmarking commit (65cd843ae06ad00123c131a431ed5304e4cd577a): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.7%	[1.7%, 1.7%]	1
Improvements ✅ (primary)	-0.4%	[-0.6%, -0.3%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.4%	[-0.6%, -0.3%]	2

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.8%	[-2.3%, -0.9%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.8%	[-2.3%, -0.9%]	4

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.2%	[1.3%, 13.8%]	68
Regressions ❌ (secondary)	12.0%	[2.1%, 20.9%]	19
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-5.9%	[-9.0%, -3.6%]	8
All ❌✅ (primary)	4.2%	[1.3%, 13.8%]	68

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 647.596s -> 652.28s (0.72%) Artifact size: 310.01 MiB -> 309.94 MiB (-0.02%)

Mar 11 '24 19:03 rust-timer

Umm, did this help for the parallel frontend (and how much?). Otherwise, this looks like a severe regression for the single-threaded version.

Mar 11 '24 19:03 Kobzol

(for those that got confused, the linked perf report is for the cycles count rather than the instructions count)

Mar 11 '24 19:03 bjorn3

Ah yes, sorry, forgot to mention that. The same regression is for walltime.

Mar 11 '24 19:03 Kobzol

That's super weird. Did I trash CPU caching of the query tables or sth? (Edit: Jup, I did https://perf.rust-lang.org/compare.html?start=d255c6a57c393db6221b1ff700daea478436f1cd&end=65cd843ae06ad00123c131a431ed5304e4cd577a&stat=cache-misses) Before this PR we invoked all queries of the same kind right after each other, now we invoke all queries for the same id together.

Let's revert and revisit.

Mar 11 '24 19:03 oli-obk

rust rust copied to clipboard

Run a single huge par_body_owners instead of many small ones after each other.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Overall result: ❌✅ regressions and improvements - no action needed

rust
rust copied to clipboard