rust icon indicating copy to clipboard operation
rust copied to clipboard

Run a single huge par_body_owners instead of many small ones after each other.

Open oli-obk opened this issue 4 months ago • 13 comments

This improves parallel rustc parallelism by avoiding the bottleneck after each individual par_body_owners (because it needs to wait for queries to finish, so if there is one long running one, a lot of cores will be idle while waiting for the single query).

oli-obk avatar Mar 07 '24 14:03 oli-obk

r? @davidtwco

rustbot has assigned @davidtwco. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot avatar Mar 07 '24 14:03 rustbot

@bors try @rust-timer queue

oli-obk avatar Mar 07 '24 14:03 oli-obk

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-timer avatar Mar 07 '24 14:03 rust-timer

:hourglass: Trying commit 164473b2ac96cd6dccd0a53bc7a0a99ea05004ce with merge c0e49b5e16f7b562786f83d7702b40f242942aee...

bors avatar Mar 07 '24 14:03 bors

:sunny: Try build successful - checks-actions Build commit: c0e49b5e16f7b562786f83d7702b40f242942aee (c0e49b5e16f7b562786f83d7702b40f242942aee)

bors avatar Mar 07 '24 16:03 bors

Queued c0e49b5e16f7b562786f83d7702b40f242942aee with parent 8c9a75b3238b66592779d6b240dbf78eacefebb8, future comparison URL. There is currently 1 preceding artifact in the queue. It will probably take at least ~1.3 hours until the benchmark run finishes.

rust-timer avatar Mar 07 '24 16:03 rust-timer

Finished benchmarking commit (c0e49b5e16f7b562786f83d7702b40f242942aee): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never @rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.2% [0.2%, 0.3%] 3
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.5% [-0.8%, -0.2%] 5
Improvements ✅
(secondary)
-0.3% [-0.5%, -0.2%] 12
All ❌✅ (primary) -0.2% [-0.8%, 0.3%] 8

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.4% [2.4%, 2.4%] 1
Improvements ✅
(primary)
-1.9% [-2.5%, -0.9%] 8
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.9% [-2.5%, -0.9%] 8

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.3% [1.1%, 14.8%] 65
Regressions ❌
(secondary)
12.3% [4.8%, 18.6%] 18
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 4.3% [1.1%, 14.8%] 65

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 648.501s -> 652.39s (0.60%) Artifact size: 172.69 MiB -> 172.63 MiB (-0.03%)

rust-timer avatar Mar 07 '24 17:03 rust-timer

@bors try @rust-timer queue

oli-obk avatar Mar 08 '24 18:03 oli-obk

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-timer avatar Mar 08 '24 18:03 rust-timer

:hourglass: Trying commit c5312694deba3c4c68ec0c0c59a95c41c0c327ed with merge 702bd6de43d33c2a791c1a2f6acccd68f55d3afd...

bors avatar Mar 08 '24 18:03 bors

:sunny: Try build successful - checks-actions Build commit: 702bd6de43d33c2a791c1a2f6acccd68f55d3afd (702bd6de43d33c2a791c1a2f6acccd68f55d3afd)

bors avatar Mar 08 '24 19:03 bors

Queued 702bd6de43d33c2a791c1a2f6acccd68f55d3afd with parent 74acabe9b042ea8c42862ee29aca2a8b7d333644, future comparison URL. There is currently 1 preceding artifact in the queue. It will probably take at least ~1.4 hours until the benchmark run finishes.

rust-timer avatar Mar 08 '24 19:03 rust-timer

Finished benchmarking commit (702bd6de43d33c2a791c1a2f6acccd68f55d3afd): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never @rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.3%, 0.3%] 2
Regressions ❌
(secondary)
1.7% [1.7%, 1.7%] 1
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.1% [-0.3%, 0.3%] 3

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.1% [1.1%, 1.1%] 1
Regressions ❌
(secondary)
5.7% [5.1%, 6.2%] 2
Improvements ✅
(primary)
-1.7% [-2.4%, -0.7%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.2% [-2.4%, 1.1%] 6

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.9% [1.1%, 13.7%] 71
Regressions ❌
(secondary)
11.7% [2.9%, 20.6%] 20
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 3.9% [1.1%, 13.7%] 71

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 648.792s -> 653.885s (0.78%) Artifact size: 172.56 MiB -> 172.55 MiB (-0.01%)

rust-timer avatar Mar 08 '24 20:03 rust-timer

@bors r=davidtwco

oli-obk avatar Mar 11 '24 11:03 oli-obk

:pushpin: Commit 55ea94402b5c57a8c937a0efdd4b86e3521dcd7f has been approved by davidtwco

It is now in the queue for this repository.

bors avatar Mar 11 '24 11:03 bors

:hourglass: Testing commit 55ea94402b5c57a8c937a0efdd4b86e3521dcd7f with merge 65cd843ae06ad00123c131a431ed5304e4cd577a...

bors avatar Mar 11 '24 14:03 bors

:sunny: Test successful - checks-actions Approved by: davidtwco Pushing 65cd843ae06ad00123c131a431ed5304e4cd577a to master...

bors avatar Mar 11 '24 16:03 bors

Finished benchmarking commit (65cd843ae06ad00123c131a431ed5304e4cd577a): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.7% [1.7%, 1.7%] 1
Improvements ✅
(primary)
-0.4% [-0.6%, -0.3%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.4% [-0.6%, -0.3%] 2

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.8% [-2.3%, -0.9%] 4
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.8% [-2.3%, -0.9%] 4

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.2% [1.3%, 13.8%] 68
Regressions ❌
(secondary)
12.0% [2.1%, 20.9%] 19
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-5.9% [-9.0%, -3.6%] 8
All ❌✅ (primary) 4.2% [1.3%, 13.8%] 68

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 647.596s -> 652.28s (0.72%) Artifact size: 310.01 MiB -> 309.94 MiB (-0.02%)

rust-timer avatar Mar 11 '24 19:03 rust-timer

Umm, did this help for the parallel frontend (and how much?). Otherwise, this looks like a severe regression for the single-threaded version.

Kobzol avatar Mar 11 '24 19:03 Kobzol

(for those that got confused, the linked perf report is for the cycles count rather than the instructions count)

bjorn3 avatar Mar 11 '24 19:03 bjorn3

Ah yes, sorry, forgot to mention that. The same regression is for walltime.

Kobzol avatar Mar 11 '24 19:03 Kobzol

That's super weird. Did I trash CPU caching of the query tables or sth? (Edit: Jup, I did https://perf.rust-lang.org/compare.html?start=d255c6a57c393db6221b1ff700daea478436f1cd&end=65cd843ae06ad00123c131a431ed5304e4cd577a&stat=cache-misses) Before this PR we invoked all queries of the same kind right after each other, now we invoke all queries for the same id together.

Let's revert and revisit.

oli-obk avatar Mar 11 '24 19:03 oli-obk