libs-team icon indicating copy to clipboard operation
libs-team copied to clipboard

Implement (non-short-circuiting) `Sum` for `Saturating<_>`

Open abonander opened this issue 8 months ago • 6 comments

Proposal

Problem statement

The Saturating type does not currently implement Sum (as of the 1.87.0 release) for any numeric types, even though Wrapping does.

Motivating examples or use cases

Summing a list of integers with saturating arithmetic.

This seems like a simple oversight, so I assume the addition of these impls would be relatively uncontroversial.

However, my current use-case is summing the sizes of a binary blob broken up using content-defined chunking:

struct Blob {
    chunks: Vec<ChunkMeta>,
}

// The actual contents of the chunks are transferred separately.
struct ChunkMeta {
    hash: [u8; 32],
    len: u64,
}

impl Blob {
    pub fn total_size(&self) -> u64 {
        self.chunks.map(|chunk| Saturating(chunk.len)).sum().0
    }
}

Summing the size is necessary to enforce a total data size limit since the binary blob is created by a second party, and the metadata is sent separately so it could be trivially forged.

Saturating seems like the appropriate thing to do here since the default behavior could silently wrap in release mode (Wrapping is clearly the wrong solution for the same reason) whereas checked arithmetic would complicate the implementation unnecessarily. In practice, the limit should be far below u64::MAX, so saturating at any value above the limit will result in correct behavior.

Solution sketch

Implement Sum and Sum<T> for Saturating<T> where existing impls for Wrapping are already defined: https://doc.rust-lang.org/stable/src/core/iter/traits/accum.rs.html#95-97

My use-case does not need the Product impl, however we would get it without any additional work thanks to the macro. I assume that's acceptable even though it defies the YAGNI principle.

Edit: after reviewing the discussion in #303, I propose not to short-circuit because I expect arithmetic overflow to be a degenerate case, so performance is a secondary concern. I'm just trying to specify the behavior for if it happens. Given that no one else who has proposed this change apparently cared about short-circuiting, I think letting the discussion in #303 get stuck on this question was letting "perfect" be the enemy of "good" here.

Besides, I think if I wanted a short-circuiting implementation here, I would implement it over checked arithmetic and use .unwrap_or(T::MAX) at the end. That would be significantly clearer in intent.

Alternatives

This can be currently implemented using .fold() (which is what I'm doing), but it's not as elegant:

self.chunks
    .iter()
    // FIXME: `Saturating` does not implement `Sum`
    .fold(Saturating(0), |mut sum, chunk| {
        // `Saturating` only implements `AddAssign<u64>` and not `Add<u64>`,
        // so even this is more annoying than it seems necessary to be.
        sum += chunk.len;
        sum
    })
    .0

However, accepting this as an alternative would seem to defeat the purpose of Iterator::sum(), since the same logic can be applied to all integer arithmetic.

Again though, given that it's already implemented for Wrapping, this seems like a simple oversight.

Links and related work

I was unable to find any existing discussions for Rust with a quick search.

Finding examples in other languages seems superfluous here.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

abonander avatar Jun 14 '25 14:06 abonander

Duplicate of #303

pitaj avatar Jun 14 '25 14:06 pitaj

It didn't come up when I searched, but I guess I forgot to check closed issues on this repository specifically.

Regardless, I would propose not to short-circuit since I mostly want to use this method for convenience.

In my specific use-case, I expect saturating to be a degenerate case, so performance is a secondary concern. I'm just trying to define the behavior for if it happens. I also plan to limit the maximum number of chunks since that's another potential exploit.

Given that no one else who has proposed this change apparently cared about short-circuiting, I think closing #303 as not planned was letting perfect be the enemy of good here.

abonander avatar Jun 14 '25 14:06 abonander

Besides, I think if I wanted a short-circuiting implementation here, I would implement it over checked arithmetic and use .unwrap_or(T::MAX) at the end. That would be significantly clearer in intent.

abonander avatar Jun 14 '25 15:06 abonander

This can be currently implemented using .fold() (which is what I'm doing), but it's not as elegant:

Using the Saturating wrapper type makes this a lot clunkier than it has to be. Folding over u64 with u64::saturating_add is simpler. Arguably just as good as .map(Saturating).sum().0. In general I think Saturating and Wrapping very rarely pull their weight.

hanna-kruppe avatar Jun 14 '25 16:06 hanna-kruppe

Ideally I could just do .sum::<Saturating<u64>>().

Moreover, the fact this works (using .sum() in general) for Wrapping but not Saturating is a weird asymmetry.

abonander avatar Jun 14 '25 16:06 abonander

We talked about this in today's @rust-lang/libs-api meeting.

We felt like there was some potentially surprising behavior for signed types (e.g. what is the saturating sum of [i32::MAX, i32::MAX, i32::MIN, i32::MIN]). However, unsigned types don't have that problem.

Given that, we're happy to approve this ACP for unsigned types.

We did find ourselves discussing short-circuiting behavior. We're currently leaning towards having the comments on the methods leave short-circuiting or not intentionally unspecified (e.g. the implementation can do it if it's efficient to do so), with the rationale being that if the user cares one way or the other, they could write a fold over Saturating or Checked themselves.

We can settle that question in the FCP that we'll need for the PR (which will be insta-stable). Meanwhile, please go ahead with a PR adding this for unsigned Saturating types, and mark it as needs-fcp. Thank you!

joshtriplett avatar Jun 24 '25 16:06 joshtriplett