rust icon indicating copy to clipboard operation
rust copied to clipboard

Don't delegate to default implementations of `ExactSizeIterator`

Open yotamofek opened this issue 2 weeks ago • 13 comments

The default ExactSizeIterator calls self.size_hint() and then performs an assertion that the size hint's lower and higher bounds are equal, which is wasteful.

I'm sure the compiler/LLVM can elide that assertion in most cases (e.g. when the size_hint impl returns something like (n, Some(n)), but it still might affect inlining decisions, and maybe even cause LLVM to have to work harder.

There are a lot of changes here, some of which are not as clear-cut as others, so first I wanna see if this has any perf impact.

yotamofek avatar Nov 27 '25 14:11 yotamofek

@bors try @rust-timer queue

yotamofek avatar Nov 27 '25 15:11 yotamofek

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-timer avatar Nov 27 '25 15:11 rust-timer

:hourglass: Trying commit a2e0e2e1c7b3dcedbc5f79a0cc401d477d902a93 with merge ccdc19f7ab8fe127cb8759c4ac5efb5ff7e80b2e…

To cancel the try build, run the command @bors try cancel.

Workflow: https://github.com/rust-lang/rust/actions/runs/19741524094

rust-bors[bot] avatar Nov 27 '25 15:11 rust-bors[bot]

:sunny: Try build successful (CI) Build commit: ccdc19f7ab8fe127cb8759c4ac5efb5ff7e80b2e (ccdc19f7ab8fe127cb8759c4ac5efb5ff7e80b2e, parent: cf8a95590a1b768b7711f2631d5b5e3ead464de7)

rust-bors[bot] avatar Nov 27 '25 18:11 rust-bors[bot]

Queued ccdc19f7ab8fe127cb8759c4ac5efb5ff7e80b2e with parent cf8a95590a1b768b7711f2631d5b5e3ead464de7, future comparison URL. There are currently 0 preceding artifacts in the queue. It will probably take at least ~1.3 hours until the benchmark run finishes.

rust-timer avatar Nov 27 '25 18:11 rust-timer

Finished benchmarking commit (ccdc19f7ab8fe127cb8759c4ac5efb5ff7e80b2e): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never @rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.1% [0.1%, 0.1%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results (primary -3.3%, secondary -2.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-3.3% [-3.9%, -2.6%] 2
Improvements ✅
(secondary)
-2.4% [-2.4%, -2.4%] 1
All ❌✅ (primary) -3.3% [-3.9%, -2.6%] 2

Cycles

Results (primary -1.2%, secondary -0.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.6% [2.6%, 2.6%] 1
Regressions ❌
(secondary)
3.0% [2.6%, 3.4%] 2
Improvements ✅
(primary)
-2.5% [-2.9%, -2.3%] 3
Improvements ✅
(secondary)
-3.5% [-4.3%, -2.2%] 3
All ❌✅ (primary) -1.2% [-2.9%, 2.6%] 4

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 470.335s -> 470.278s (-0.01%) Artifact size: 386.93 MiB -> 386.95 MiB (0.01%)

rust-timer avatar Nov 27 '25 19:11 rust-timer

r? @scottmcm

rustbot has assigned @scottmcm. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot avatar Nov 27 '25 20:11 rustbot

Putting this up for review.

Perf is mostly neutral (secondary benchmark that regresses looks bimodal?), so not sure at all whether it's worth the effort and complexity that this adds.

OTOH, maybe code outside of rustc uses ExactSizeIterator as an optimization more often than rustc, due to it being stable, unlike TrustedLen. So maybe there's code out there that will benefit from this. I dunno..

I don't mind just abandoning this, or maybe whittling it down to only the very straight forward changes.

yotamofek avatar Nov 27 '25 20:11 yotamofek

Some of these changes have some overlap with https://github.com/rust-lang/rust/pull/146436.

hkBst avatar Nov 28 '25 13:11 hkBst

Some of these changes have some overlap with #146436.

They might conflict if one gets merged before the other, but yours doesn't actually touch the ESI implementations.

yotamofek avatar Nov 28 '25 13:11 yotamofek

:umbrella: The latest upstream changes (presumably #149560) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Dec 03 '25 01:12 bors

FWIW, if you try

#![feature(binary_heap_into_iter_sorted)]
pub fn demo(it: &std::collections::binary_heap::IntoIterSorted<i32>) -> usize {
    let (low, high) = it.size_hint();
    assert_eq!(Some(low), high);
    low
}

in playground with "Show MIR" https://play.rust-lang.org/?version=nightly&mode=release&edition=2024&gist=a074bb1d7524e51ab5cb16746fa66505

Then you'll see that the assert actually optimizes out before we even get to LLVM

    bb0: {
        _0 = copy ((((*_1).0: std::collections::BinaryHeap<i32>).0: std::vec::Vec<i32>).1: usize);
        StorageLive(_6);
        _6 = Le(copy _0, const <i32 as std::mem::SizedTypeProperties>::MAX_SLICE_LEN);
        assume(move _6);
        StorageDead(_6);
        _5 = Option::<usize>::Some(copy _0);
        StorageLive(_2);
        _2 = copy _5;
        // DBG: _3 = &_5;
        // DBG: _4 = &_2;
        StorageDead(_2);
        return;
    }

(It still has some silly leftovers -- probably due to phase ordering -- but there's no assert any more, just the comparison to tell LLVM it's UB to have a too-high length so it can optimize based on that.)

So overall I don't know how much it's worth bothering adding a bunch of extra code in the library.

That said, at the same time the overrides that just call an existing self.len() seem kinda reasonable anyway, so I dunno.


An alternative: this is just a sanity check, so I wonder if we might say "hey, that all-the-time assert is silly" instead, and maybe (in a different PR that would probably need a team nomination) replace the existing

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn len(&self) -> usize {
        let (lower, upper) = self.size_hint();
        // Note: This assertion is overly defensive, but it checks the invariant
        // guaranteed by the trait. If this trait were rust-internal,
        // we could use debug_assert!; assert_eq! will check all Rust user
        // implementations too.
        assert_eq!(upper, Some(lower));
        lower
    }

with

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    #[rust_inherit_overflow_checks]
    fn len(&self) -> usize {
        let (lower, upper) = self.size_hint();
        // If overflow checks are on, check that the size hint isn't malformed.
        // An incorrect `size_hint` isn't UB, so we don't *have* to check,
        // and thus in default release mode we don't to save time.
        if let Some(upper) = upper {
            _ = upper - lower;
        }
        lower
    }

on the theory that in release mode it's just not ever worth bothering with this check.


Let me know how you're feeling, @rustbot author

scottmcm avatar Dec 06 '25 08:12 scottmcm

Reminder, once the PR becomes ready for a review, use @rustbot ready.

rustbot avatar Dec 06 '25 08:12 rustbot