rust icon indicating copy to clipboard operation
rust copied to clipboard

Redirect `__rust_dealloc` to `sdallocx`

Open Zoxc opened this issue 2 years ago • 10 comments

This could use a perf run.

Zoxc avatar Mar 11 '24 05:03 Zoxc

r? @petrochenkov

rustbot has assigned @petrochenkov. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot avatar Mar 11 '24 05:03 rustbot

@bors try @rust-timer queue

Kobzol avatar Mar 11 '24 07:03 Kobzol

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-timer avatar Mar 11 '24 07:03 rust-timer

:hourglass: Trying commit 0f657ac3706935687aec5a04e9922ae001fe244f with merge ecb331519998b3890fd9db9720e01106318d4103...

bors avatar Mar 11 '24 07:03 bors

:sunny: Try build successful - checks-actions Build commit: ecb331519998b3890fd9db9720e01106318d4103 (ecb331519998b3890fd9db9720e01106318d4103)

bors avatar Mar 11 '24 08:03 bors

Queued ecb331519998b3890fd9db9720e01106318d4103 with parent a6d93acf5fdeb020ab86cc0d30d5672c23a7dba6, future comparison URL. There are currently 2 preceding artifacts in the queue. It will probably take at least ~2.6 hours until the benchmark run finishes.

rust-timer avatar Mar 11 '24 08:03 rust-timer

Finished benchmarking commit (ecb331519998b3890fd9db9720e01106318d4103): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never @rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.6% [-1.2%, -0.2%] 148
Improvements ✅
(secondary)
-0.6% [-3.8%, -0.3%] 80
All ❌✅ (primary) -0.6% [-1.2%, -0.2%] 148

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.2% [2.2%, 4.6%] 8
Regressions ❌
(secondary)
3.5% [1.8%, 5.0%] 27
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 3.2% [2.2%, 4.6%] 8

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.8% [-1.9%, -1.7%] 2
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 645.581s -> 646.145s (0.09%) Artifact size: 309.97 MiB -> 309.93 MiB (-0.01%)

rust-timer avatar Mar 11 '24 12:03 rust-timer

This was previously tried here, although seemingly in a different way. CC @nnethercote

Kobzol avatar Mar 11 '24 12:03 Kobzol

The PR needs a proper description - what exactly it changes and why it's different from jemalloc overrides below, why it helps, what __rust_dealloc and sdallocx are, etc.

petrochenkov avatar Mar 11 '24 12:03 petrochenkov

sdallocx is the jemalloc function that allows passing a size when deallocating, as an optimization, allowing the allocator to skip the relevant metadata lookup it would otherwise have to do.

workingjubilee avatar Mar 11 '24 15:03 workingjubilee

If https://github.com/rust-lang/rust/pull/122362 works out we could probably just do this change with a #[global_allocator].

Zoxc avatar Mar 12 '24 04:03 Zoxc

I'm going to mark this as blocked on #122362. @Zoxc; Feel free to relabel if you want to make progress on this directly.

@rustbot label -S-waiting-on-author +S-blocked

oskgo avatar Jul 26 '24 05:07 oskgo