
`num::WrappingFrom` trait for conversions between integers

Open scottmcm opened this issue 1 year ago • 21 comments

We have From for infallible, TryFrom for checked, and this proposes WrappingFrom for modular conversions.
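For readers without the rendered RFC at hand, here is a minimal sketch of what such a trait could look like. The trait shape and the macro name `impl_wrapping_from` are assumptions for illustration; only the `wrapping_from` method name appears in this thread. For primitive integers the modular conversion is assumed to coincide with an `as` cast.

```rust
/// Hypothetical shape of the proposed trait (Self = destination type).
pub trait WrappingFrom<T> {
    fn wrapping_from(value: T) -> Self;
}

// Illustrative helper macro: implements the trait for a few integer pairs
// by delegating to `as`, which is modular between integer types.
macro_rules! impl_wrapping_from {
    ($($src:ty => $dst:ty),* $(,)?) => {
        $(impl WrappingFrom<$src> for $dst {
            fn wrapping_from(value: $src) -> Self {
                value as $dst
            }
        })*
    };
}

impl_wrapping_from!(u16 => u8, u8 => u16, i32 => u32, i8 => u32);
```

With these impls, `<u8 as WrappingFrom<u16>>::wrapping_from(300)` yields `44`, since 300 mod 256 = 44.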

Rendered

scottmcm avatar Oct 01 '24 05:10 scottmcm

An alternative is to provide this as inherent methods. That solves the problems with `as`, but not the generics use case.

ChayimFriedman2 avatar Oct 01 '24 05:10 ChayimFriedman2

A few thoughts:

  1. Why not From<Wrapping<T>> for Wrapping<U>?
  2. What about NonZero types?

clarfonthey avatar Oct 01 '24 05:10 clarfonthey

I look forward to being able to write fewer `as` casts, and this seems like a well-defined chunk of them to carve off. Thoughts:

  • The RFC does not discuss the tradeoff between being a From-shaped trait (Self = destination type) and an Into-shaped trait (Self = source type). I think From is probably more appropriate (because Into is frequently ambiguous in numeric expressions), but it should be discussed explicitly.
  • Should a trait with exactly this signature be added to num-traits (or another published library) so that the new design can be tested out in stable-targeting code? The RFC mentions num_traits::FromPrimitive, but that is a fallible conversion, which is very different.
  • The trait documentation doesn't say much that actually constrains what the trait can be expected to do, unless you know exactly what “quantized numeric lattice” means. The last paragraph is the most concrete, and it does give a property that should hold, but it is stated for “integer”s, so it is just an example, not a “should”.
    • What are examples of non-integer types that should implement this trait, and what properties should they have?
    • What are examples of non-integer types that should not implement this trait (because they don't have any operation that is a wrapping conversion in this sense)?

kpreid avatar Oct 01 '24 20:10 kpreid

  • What are examples of non-integer types that should implement this trait, and what properties should they have?

The only examples I can think of are not exactly well known (e.g. Galois Fields):

impl WrappingFrom<GaloisField<9>> for GaloisField<3> {
    ...
}

though a better example might be:

impl WrappingFrom<Angle> for PointOnUnitCircle { // both 0 and 2 pi radians convert to the same point
   ...
}

programmerjake avatar Oct 01 '24 20:10 programmerjake

  • What are examples of non-integer types that should not implement this trait (because they don't have any operation that is a wrapping conversion in this sense)?

that part's easier:

  • impl !WrappingFrom<u8> for ! { ... } since you can't make a value of type !
  • impl !WrappingFrom<String> for char { ... } since it doesn't have a consistent definition and "" contains no chars anyway
  • impl !WrappingFrom<f64> for f32 { ... } -- conversion isn't wrapping...

programmerjake avatar Oct 01 '24 20:10 programmerjake

The RFC does not discuss the tradeoff between being a From-shaped trait (Self = destination type) and an Into-shaped trait (Self = source type). I think From is probably more appropriate (because Into is frequently ambiguous in numeric expressions), but it should be discussed explicitly.

I think that having a WrappingInto which depends on WrappingFrom might be nice, so you could use a method-call version. It's unlikely you'd want to actually implement this, like with From and Into, but it feels valuable IMHO.
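A rough sketch of how such a method-call version could be layered on top, mirroring how `Into` is derived from `From`. Both trait definitions here are assumptions for illustration, not the RFC's text.

```rust
/// Hypothetical From-shaped trait (Self = destination type).
pub trait WrappingFrom<T> {
    fn wrapping_from(value: T) -> Self;
}

/// Hypothetical Into-shaped companion (Self = source type).
pub trait WrappingInto<U> {
    fn wrapping_into(self) -> U;
}

// Blanket impl: every `WrappingFrom` impl provides `WrappingInto` for free,
// so users would only ever implement `WrappingFrom`.
impl<T, U: WrappingFrom<T>> WrappingInto<U> for T {
    fn wrapping_into(self) -> U {
        U::wrapping_from(self)
    }
}

// One example impl so the blanket impl has something to work with.
impl WrappingFrom<u32> for u16 {
    fn wrapping_from(value: u32) -> Self {
        value as u16
    }
}
```

Usage would then look like `let x: u16 = 0x1_2345_u32.wrapping_into();`, which keeps only the low 16 bits.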

clarfonthey avatar Oct 01 '24 22:10 clarfonthey

Thanks for writing a more focussed RFC @scottmcm.

I agree that something like this should exist, both for usage in generics, and with an eventual goal of making `as` casts obsolete (or just Clippy-linted for anything that's not a pointer cast).

Not distinguishing between truncating and extending casts is a minor flaw, but likely better than attempting to distinguish these cases with separate traits.

No reflexive impl

My take is that if A: WrappingFrom<B> then we should have reflexive impls A: WrappingFrom<A> and B: WrappingFrom<B>, however it would be preferable to make this a soft expectation on implementations, and not use a blanket impl. Partly because non-numeric types shouldn't get this impl, but mostly because Rust does not (yet) have specialization or any other story for overlapping blanket impls.

  1. Why not From<Wrapping<T>> for Wrapping<U>?

Because From never truncates. And because it would be a pain to use (having to unwrap the target U type). At one point we used Wrapping in rand code; now we mostly use wrapping_add, wrapping_mul etc.
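To illustrate the ergonomic difference mentioned here, a small comparison of `std::num::Wrapping` with the inherent `wrapping_add` method. The function names are made up for the example.

```rust
use std::num::Wrapping;

/// Increment using the `Wrapping` newtype; the `.0` at the end is the
/// unwrapping step needed to get a plain integer back.
pub fn next_with_wrapper(x: u32) -> u32 {
    (Wrapping(x) + Wrapping(1)).0
}

/// Increment using the inherent method on the plain integer.
pub fn next_with_method(x: u32) -> u32 {
    x.wrapping_add(1)
}
```

Both functions wrap `u32::MAX` around to `0`; the second needs no wrapper type at either end.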

dhardy avatar Oct 02 '24 08:10 dhardy

Not distinguishing between truncating and extending casts is a minor flaw, but likely better than attempting to distinguish these cases with separate traits.

One thing I've been thinking here is that From is an extending cast. So if you want a "definitely a truncation" function you could write something like where A: WrappingFrom<B>, B: From<A>.
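A sketch of that bound combination, assuming a hypothetical `WrappingFrom` trait of the shape discussed in this thread. The function name `truncate` is illustrative only.

```rust
/// Hypothetical trait, as discussed in this thread.
pub trait WrappingFrom<T> {
    fn wrapping_from(value: T) -> Self;
}

impl WrappingFrom<u32> for u8 {
    fn wrapping_from(value: u32) -> Self {
        value as u8
    }
}

/// Only callable when `Big -> Small` has a wrapping conversion and
/// `Small -> Big` is lossless, so the conversion can only narrow
/// (or be an identity, if reflexive impls exist).
pub fn truncate<Small, Big>(value: Big) -> Small
where
    Small: WrappingFrom<Big>,
    Big: From<Small>,
{
    Small::wrapping_from(value)
}
```

`truncate::<u8, u32>(0x1234)` compiles because `u32: From<u8>`; the reverse direction would be rejected since `u8: From<u32>` does not hold.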

But overall it's not obvious to me in which contexts you'd want to require a truncation, and trying to write a trait for truncation specifically reopens a bunch of hard problems about usize that WrappingFrom avoids.

scottmcm avatar Oct 03 '24 17:10 scottmcm

But overall it's not obvious to me in which contexts you'd want to require a truncation,

Here’s one I can think of: If you have an integer that’s uniformly distributed across the type’s entire range (e.g. a hash), then truncating it produces a uniformly distributed value of the smaller type. If you convert it to a larger type, then now you have added zero bits, so it no longer covers the entire range.

I don't know of any specific applications that want this property and aren't non-generic or better served by different tools, though.
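The uniformity argument can be checked exhaustively for a small pair of types. This is only an illustration with `u16` and `u8`, and the function names are made up for the example.

```rust
/// Counts how often each `u8` value appears when every `u16` value
/// is truncated with a modular cast.
pub fn truncation_histogram() -> [u32; 256] {
    let mut counts = [0u32; 256];
    for x in 0..=u16::MAX {
        counts[(x as u8) as usize] += 1;
    }
    counts
}

/// Largest value reachable when every `u8` is zero-extended to `u16`.
pub fn max_after_extension() -> u16 {
    (0..=u8::MAX).map(u16::from).max().unwrap()
}
```

Every bucket of the histogram ends up with the same count, while extension never reaches anything above 255 in the wider type.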

kpreid avatar Oct 03 '24 17:10 kpreid

I like this proposal, but I think there is a more general need for lossy conversions than just integers. For example, you could imagine a trait

trait FromLossy<T> {
     fn from_lossy(t: &T) -> Self;
}

that converts from a wider set into a narrower one, but infallibly, like a "best effort" From. This would work for integers, yes, but also from larger floats to smaller ones, from bytes to utf8 strings, and I'm sure a whole host of other places where we currently use inherent methods.

It would be really useful to be able to, for example, do

let s: String = b"hello\xffworld".into_lossy();
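For comparison, this is roughly how those conversions are spelled today with existing standard-library methods and casts. The wrapper function names are made up for the example.

```rust
/// Bytes to `String`, replacing invalid UTF-8 with U+FFFD.
pub fn lossy_utf8(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// `f64` to `f32`, rounding to a nearby representable value.
pub fn lossy_f32(x: f64) -> f32 {
    x as f32
}
```

For the byte string above, `lossy_utf8(b"hello\xffworld")` gives `"hello\u{FFFD}world"`.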

FHTMitchell avatar Oct 04 '24 10:10 FHTMitchell

@FHTMitchell while that may be true, it is a very different sort of conversion from integer wrapping conversions, so I don't think this is a good place to discuss it. Feel free to go back to https://github.com/rust-lang/rfcs/pull/2484 (or possibly start a new, focussed, RFC).

dhardy avatar Oct 04 '24 14:10 dhardy

Yeah, I definitely don't like the idea of combining float and integer conversions into one trait, since for floats, the result is actual rounding, whereas for integers, your result could be totally different from the original number. That's the main purpose of having a dedicated WrappingFrom trait: it's clear what it does, and you're not overburdening the concept of a "lossy conversion" between different methods.

clarfonthey avatar Oct 04 '24 17:10 clarfonthey

In particular, as the #2484 discussion shows, it's very hard to define a useful generic concept of "lossy" that isn't just "meh, it does something, good luck" -- because you need a precise definition for generic code to make any sense.

And, for example, f64 -> f32 fits best with "lossy" being something like "the closest representable value in the target type", but that would mean 300_u16 -> 255_u8, which is very much not what <u8 as WrappingFrom<u16>>::wrapping_from is proposed to do here.
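The two readings give different answers for the same input. A small sketch with illustrative function names:

```rust
/// Modular narrowing, which is what `as` does between integer types.
pub fn wrap_to_u8(x: u16) -> u8 {
    x as u8
}

/// "Closest representable value" narrowing, i.e. saturation at the
/// upper bound of the target type.
pub fn nearest_u8(x: u16) -> u8 {
    x.min(u16::from(u8::MAX)) as u8
}
```

For an input of 300, the wrapping version returns 44 and the "closest value" version returns 255.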

So as others have said, this is one particular type of conversion. If there are others, those will be other RFCs.

scottmcm avatar Oct 04 '24 18:10 scottmcm

Also prior art: https://github.com/rust-lang/rfcs/pull/3415

Summary of the discussion:

  • People like to bikeshed method names

  • wrap is probably a better name than truncate

  • If `as` for lossy numeric casts is deprecated, some people are concerned about verbosity and want shorter method names, and many are concerned about ecosystem churn

  • Deprecating integer to float casts is harder to justify (they don't wrap or saturate; the worst that can happen is a precision loss that is expected when dealing with floats). The other direction (float to integer) is more controversial.

  • There was no consensus on whether the TruncatingFrom and SaturatingFrom traits should be implemented for lossless conversions (e.g. u8 -> u16) as well.

Aloso avatar Oct 04 '24 21:10 Aloso

an idea for {int}: WrappingFrom<{float}>: that's how you spell JavaScript-style float -> int conversions which are basically:

if input.is_finite() {
    // pseudocode: can be done without actually needing an intermediate bigint or allocation
    Output::wrapping_from(input.trunc().to_bigint())
} else {
    0
}
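A runnable sketch of the same idea for a `u32` target, avoiding any big-integer type by reducing modulo 2^32 in floating point. The function name is hypothetical, and this is one possible reading of the semantics described above rather than anything the RFC specifies.

```rust
/// JavaScript-style float -> u32 conversion (hypothetical helper name).
pub fn js_style_wrapping_u32(input: f64) -> u32 {
    if !input.is_finite() {
        return 0; // NaN and the infinities map to zero
    }
    const TWO_POW_32: f64 = 4_294_967_296.0;
    // `trunc` rounds toward zero; `rem_euclid` then reduces the integral
    // value into the range [0, 2^32), so the final cast cannot saturate.
    let wrapped = input.trunc().rem_euclid(TWO_POW_32);
    wrapped as u32
}
```

Negative inputs wrap the same way two's complement does, so `-1.0` maps to `u32::MAX`.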

programmerjake avatar Oct 04 '24 21:10 programmerjake

{int}: WrappingFrom<{float}> would be confusing and not very useful.

  • The method name wrapping_from doesn't fit since it doesn't just wrap, but also rounds towards zero, and turns ±Infinity and NaN into 0
  • The behavior of {float} as {int} is to saturate (rather than wrap) large numbers, which I think is more useful.
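For reference, the saturating behavior of float-to-integer `as` casts in current Rust can be seen with a one-line helper (the function name is made up for the example):

```rust
/// Float -> integer `as` cast: rounds toward zero, saturates at the
/// bounds of the target type, and maps NaN to zero.
pub fn float_as_u8(x: f64) -> u8 {
    x as u8
}
```

So `300.7` becomes `255`, `-5.0` becomes `0`, and `3.9` becomes `3`.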

Aloso avatar Oct 05 '24 00:10 Aloso

{int}: WrappingFrom<{float}> would be confusing and not very useful.

  • The method name wrapping_from doesn't fit since it doesn't just wrap, but also rounds towards zero, and turns ±Infinity and NaN into 0

if you think of wrapping as returning just the bits that fit in the target integer, JavaScript-style semantics are reasonable as wrapping float -> int: e.g. 78187493530.73755f64 which in hexadecimal is 0x12_3456_789A.BCD0, the wrapping conversion extracts just the bits that fit giving 0x3456_789Au32

though this only works for negative numbers if you think of treating the float as being in sign-magnitude and extracting the bits as an unsigned integer, then negating the result if the float was negative (kinda like sign-magnitude to twos complement conversion). Infinities are treated as being 0x1...00000 with infinitely many zeros. NaN converting to zero matches Rust float -> int conversion.

  • The behavior of {float} as {int} is to saturate (rather than wrap) large numbers, which I think is more useful.

something else being more useful doesn't mean wrapping can't also be useful...

programmerjake avatar Oct 05 '24 00:10 programmerjake

This talks about From<Wrapping<T>> but I think impl WrappingFrom<T> for U is more akin to impl From<T> for Wrapping<U>. For example in impl WrappingFrom<u32> for u16 we don't care that the u32 itself might wrap, but the output u16 has the property that adding 1 << 16 is a noop.

ronnodas avatar Oct 07 '24 11:10 ronnodas

but the output u16 has the property that adding u16::MAX is a noop.

I think you mean adding 0x10000 since u16::MAX == 0xFFFF and adding that isn't a noop when wrapping, it's equivalent to subtracting 1.
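Both claims are easy to check by adding the constant to a `u32` before narrowing it. The helper name is illustrative.

```rust
/// Narrows with modular semantics, as `as` does for integers.
pub fn to_u16(x: u32) -> u16 {
    x as u16
}

/// Checks, for one input, that adding 0x1_0000 before narrowing changes
/// nothing and that adding 0xFFFF is the same as subtracting one.
pub fn check(x: u32) -> bool {
    let adding_modulus_is_noop = to_u16(x.wrapping_add(0x1_0000)) == to_u16(x);
    let adding_max_subtracts_one =
        to_u16(x.wrapping_add(0xFFFF)) == to_u16(x).wrapping_sub(1);
    adding_modulus_is_noop && adding_max_subtracts_one
}
```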

programmerjake avatar Oct 07 '24 17:10 programmerjake

I see i32::max as u32 as strange rather than evil, and I don't see how this proposal changes it. i32::max as u32 will still provide the integral address of a function after this proposal is implemented. This proposal does not give an alternative way to express this. I find this example confusing as motivation for the proposal.

On the other hand, it is not clear from the proposal how signed to unsigned conversion (and vice versa) is handled. Will WrappingFrom<i32> for u32 exist? And what does it do? From the name I would guess that it transmutes the representation such that it provides the two's complement. But from the description I could as easily think that it truncates to 31 bits. Or that it is not implemented at all.

My understanding is that WrappingFrom<T> for U is implemented for the primitive integral types where sizeof<T> >= sizeof<U> and takes sizeof<U> bytes from the lowest bytes of T and transmutes them to U. But I cannot recreate this from the definitions in the reference section. While this may be true for the primitive types, I expect that this should be specified in mathematical terminology so that types with other representations give similar results.

jongiddy avatar Oct 10 '24 08:10 jongiddy

My understanding is that WrappingFrom<T> for U is implemented for the primitive integral types where sizeof<T> >= sizeof<U>

afaict WrappingFrom<T> for U is implemented for all combinations of integer types regardless of if it's a no-op, just changing type, truncating, sign-extending, or zero-extending.

WrappingFrom<T> for U for an input value $t$ of type T gives the unique value $u$ of type U where $u = t + n \times 2^W$ for the integer $n$ where $u$ is in range and $W$ is the bit-width of U -- so it does the exact same thing as t as U.
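The formula can be evaluated directly in a wider type and compared against `as`. This sketch uses `i64` as the wide type and assumes target widths below 63 bits; the function names are made up for the example.

```rust
/// The unique `u` in `[0, 2^W)` with `u = t + n * 2^W` (unsigned target).
pub fn wrap_unsigned(t: i64, width: u32) -> i64 {
    t.rem_euclid(1i64 << width)
}

/// The unique `u` in `[-2^(W-1), 2^(W-1))` with `u = t + n * 2^W`
/// (signed target).
pub fn wrap_signed(t: i64, width: u32) -> i64 {
    let modulus = 1i64 << width;
    let r = t.rem_euclid(modulus);
    if r >= modulus / 2 { r - modulus } else { r }
}
```

For example, `wrap_unsigned(-1, 32)` matches `-1i32 as u32`, and `wrap_signed(200, 8)` matches `200u8 as i8`, covering the signed/unsigned cases asked about above.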

programmerjake avatar Oct 10 '24 08:10 programmerjake