rfcs Anonymous sum types

Issue by glaebhoerl Saturday Aug 03, 2013 at 23:58 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/8277

This issue was labelled with: B-RFC in the Rust repository

Rust has an anonymous form of product types (structs), namely tuples, but not sum types (enums). One reason is that it's not obvious what syntax they could use, especially their variants. The first variant of an anonymous sum type with three variants needs to be syntactically distinct not just from the second and third variant of the same type, but also from the first variant of all other anonymous sum types with different numbers of variants.

Here's an idea I think is decent:

A type would look like this: (~str|int|int). In other words, very similar to a tuple, but with pipes instead of commas (signifying or instead of and).

A value would have the same shape (also like tuples), with a value of appropriate type in one of the "slots" and nothing in the rest:

let foo: (~str|int|int) = (!|!|666);
match foo {
    (s|!|!) => println(fmt!("string in first: %?", s)),
    (!|n|!) => println(fmt!("int in second: %?", n)),
    (!|!|m) => println(fmt!("int in third: %?", m))
}

(The marker for "nothing" is a bikeshed; other possible colors for it include whitespace, ., and -. _ means something is there but we're just not giving it a name, so it's not suitable for "nothing is there". ! has nothing-connotations from the negation operator and from the return type of functions that don't return.)
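For comparison, here is roughly what the example above has to look like in today's Rust, where the sum type must be declared and named up front. This is only a sketch: the enum `StrOrIntOrInt`, its variant names and the helper `describe` are made up for illustration and are not part of the proposal.

```rust
// Hypothetical named stand-in for the proposed anonymous type (~str|int|int).
// The variant names are positional on purpose, mirroring the "slots" above.
#[derive(Debug)]
enum StrOrIntOrInt {
    First(String),
    Second(i64),
    Third(i64),
}

// Plays the role of the `match foo { (s|!|!) => ..., ... }` example.
fn describe(foo: &StrOrIntOrInt) -> String {
    match foo {
        StrOrIntOrInt::First(s) => format!("string in first: {:?}", s),
        StrOrIntOrInt::Second(n) => format!("int in second: {:?}", n),
        StrOrIntOrInt::Third(m) => format!("int in third: {:?}", m),
    }
}

fn main() {
    // Corresponds to `(!|!|666)` in the proposed syntax.
    let foo = StrOrIntOrInt::Third(666);
    println!("{}", describe(&foo));
}
```

The anonymous form would remove the need for the enum declaration and the invented variant names, which is the whole appeal.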

I'm not sure whether this conflicts syntax-wise with closures and/or negation.

Another necessary condition for this should be demand for it. This ticket is to keep a record of the idea, in case someone else has demand but not syntax. (If the Bikesheds section of the wiki is a better place, I'm happy to move it.)

# let three = `Int 3;;
val three : [> `Int of int ] = `Int 3
# let four = `Float 4.;;
val four : [> `Float of float ] = `Float 4.
# let nan = `Not_a_number;;
val nan : [> `Not_a_number ] = `Not_a_number
# let list = [three; four; nan];;
val list  : [> `Float of float | `Int of int | `Not_a_number ] list

The val lines are the types of the let assignments, left in to see how the typing works.

In the back end, at assembly time, each name is given a globally unique integer. In the current implementation this is done by hashing, so there is a chance of collision, but that chance is extremely low and warnings can be put in place to catch it. I've also seen talk of making a global registry, so that names would just get an incremented number efficiently on first access.

A plain Polymorphic Variant with no data is represented internally as an integer:

`Blah

This becomes the integer 737303889 (yes, I checked), and comparing those is trivial. A polymorphic variant that holds data (either a single element or a tuple of elements), such as:

`Blah (42, 6.28)

gets encoded internally as an array of two fields in assembly: the first is the above number as before, and the second is the pointer to the data of the tuple (although in most cases these all get inlined into the same memory in OCaml due to inlining and optimization passes). In the typing system the above would be [> `Blah of int * float ] (in OCaml the types of a tuple are separated by *).

However, the important thing about polymorphic variants is that they can be open or closed. Any system can pass any of them that it wants, including passing them through untouched. For example, a simple way to handle something like a generic event in OCaml would be:

let f next x = match x with
  | `Blah x -> do_something_with x
  | `Foobar -> do_something_else ()
  | unhandled -> next unhandled

Which is entirely type safe, dependent on what each function handles down the chain and all.

The big thing in the type system is that types can be open or closed, i.e. they accept either any number of polymorphic variants or a closed set of them. If something like the anonymous sum types here were to be accepted, then that concept would be exceedingly useful while being very easy and very fast to statically type.
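Today's Rust has no open variants, so the closest approximation of the handler chain above is a closed enum whose unhandled values are forwarded to `next`. A rough sketch follows; `Event`, `Resized` and `handle` are hypothetical names. Unlike the OCaml version, the type of the forwarded value is not narrowed to exclude the variants already handled.

```rust
// Hypothetical closed set of events; OCaml's polymorphic variants would not
// need this up-front declaration.
#[derive(Debug)]
enum Event {
    Blah(i32),
    Foobar,
    Resized(u32, u32),
}

// Handles `Blah` and `Foobar`, and passes anything else down the chain.
fn handle(next: impl Fn(Event) -> String, x: Event) -> String {
    match x {
        Event::Blah(v) => format!("blah {}", v),
        Event::Foobar => String::from("foobar"),
        // `unhandled` still has type `Event` here; the compiler does not
        // record that it can no longer be `Blah` or `Foobar`.
        unhandled => next(unhandled),
    }
}

fn main() {
    let fallback = |e: Event| format!("unhandled: {:?}", e);
    println!("{}", handle(fallback, Event::Blah(3)));
    println!("{}", handle(fallback, Event::Resized(4, 5)));
}
```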

Feb 17 '17 23:02 OvermindDL1

Anonymous sum types might interact with -> impl Trait: at present, this code snippet cannot compile because the iterators have different types:

match x {
    A(a) => once(a).chain(foo),
    B(b) => once(bar).chain(foo).chain(b),
}

You could make this make sense with an anonymous sum type of the form impl Iterator | impl Iterator, that itself becomes an Iterator, but inferring any type like that sounds like chaos.

One could do it in std with enums like:

enum TwoIterators<A,B> {
    IterA(A),
    IterB(B),
}

impl Iterator for TwoIterators where .. { .. }

so the above code becomes

match x {
    A(a) => TwoIterators::IterA( once(a).chain(foo) ),
    B(b) => TwoIterators::IterB( once(bar).chain(foo).chain(b) ),
}

I could imagine some enum Trait sugar that did basically this too. You cannot delegate associated types or constants to an enum at runtime like this, so an enum Trait must enforce that they all agree across all the variants.
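To make the workaround concrete, here is one way the elided impl could be filled in. This is a sketch under assumptions: the bounds in the `where` clause are a guess at what the `..` stood for, and `Input`, `numbers` and the literal `0` (standing in for `bar`) are hypothetical.

```rust
use std::iter::once;

// The enum from the comment above.
enum TwoIterators<A, B> {
    IterA(A),
    IterB(B),
}

// Both variants must yield the same `Item`; this is the "all variants agree
// on associated types" restriction in practice.
impl<A, B> Iterator for TwoIterators<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Delegate to whichever variant is active.
        match self {
            TwoIterators::IterA(a) => a.next(),
            TwoIterators::IterB(b) => b.next(),
        }
    }
}

// Hypothetical input type standing in for `x` in the snippet above.
enum Input {
    A(u32),
    B(Vec<u32>),
}

// Both match arms now have the single type `TwoIterators<_, _>`.
fn numbers(x: Input) -> impl Iterator<Item = u32> {
    let foo = vec![10, 20];
    match x {
        Input::A(a) => TwoIterators::IterA(once(a).chain(foo)),
        Input::B(b) => TwoIterators::IterB(once(0).chain(foo).chain(b)),
    }
}

fn main() {
    println!("{:?}", numbers(Input::A(1)).collect::<Vec<_>>());
    println!("{:?}", numbers(Input::B(vec![7])).collect::<Vec<_>>());
}
```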

Mar 15 '17 20:03 burdges

This might sound like a weird hack, but how about just making A|B sugar for Either? I suppose it might get even weirder to start composing A|B|C as Either<A,Either<B,C>>, or to have that map to something else. What if there were some sort of general-purpose "operator overloading" in the type syntax, allowing people to experiment with various possibilities and see what gains traction? (I had yet another suggestion about allowing general-purpose substitutions, e.g. type Either<A,Either<B,C>> = Any3<A,B,C> etc.: https://www.reddit.com/r/rust/comments/6n53oa/type_substitutions_specialization_idea/ Now imagine recovering ~T === Box<T>, ~[T], ... type Box<RawSlice<T>> = Vec<T> ... through completely general-purpose means.)
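A small sketch of what the nested encoding looks like when written by hand. `Either` is defined locally so nothing external is needed, and `Any3` is an ordinary type alias pointing the usual way round rather than the substitution syntax floated above; `describe` is a hypothetical helper.

```rust
// Minimal local `Either`, defined here so the sketch needs no external crate.
#[derive(Debug)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

// What a hypothetical `A | B | C` would desugar to under this scheme.
type Any3<A, B, C> = Either<A, Either<B, C>>;

// Matching has to follow the nesting, which is where the encoding gets noisy.
fn describe(v: &Any3<String, i64, bool>) -> String {
    match v {
        Either::Left(s) => format!("string {}", s),
        Either::Right(Either::Left(n)) => format!("int {}", n),
        Either::Right(Either::Right(b)) => format!("bool {}", b),
    }
}

fn main() {
    let v: Any3<String, i64, bool> = Either::Right(Either::Left(42));
    println!("{}", describe(&v));
}
```

Note that under this encoding A|(B|C) and (A|B)|C are different types, so the sugar would have to pick one nesting and stick to it.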

Jul 14 '17 09:07 dobkeratops

@dobkeratops I'd rather just have a variant style type, i.e., with variadics.

Jul 15 '17 16:07 strega-nil

I wrote some code that could potentially fit into a library now that type macros are stable: https://gist.github.com/Sgeo/ecee21895815fb2066e3

Would people be interested in this as a crate?

Aug 07 '17 03:08 Sgeo

I've just come upon this issue, while looking for a way to avoid having some gross code that simply doesn't want to go away (actually it's slowly increasing, started at 8 variants and passed by 9 before reaching 12):

use tokio::prelude::*;

pub enum FutIn12<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12>
where
    F1: Future<Item = T, Error = E>, // ...
{
    Fut1(F1), // ...
}

impl<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12> Future
    for FutIn12<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12>
where
    F1: Future<Item = T, Error = E>, // ...
{
    type Item = T;
    type Error = E;

    fn poll(&mut self) -> Result<Async<Self::Item>, Self::Error> {
        use FutIn12::*;
        match *self {
            Fut1(ref mut f) => f.poll(), // ...
        }
    }
}

I was thus thinking that it'd be great to have anonymous sum types that automatically derived the traits shared by all their variants, so that I could get rid of this code and just have my -> impl Future<Item = (), Error = ()> function return the futures in its various branches (with some syntax, that ideally wouldn't expose the total number of variants but let rustc infer it from the returning points, so that adding a branch doesn't require changing all the other return points), and have the anonymous sum type match the -> impl Future return type.
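For reference, a two-variant version of that boilerplate written against `std::future` rather than the futures 0.1 API used above. It is a sketch only: `FutIn2`, `NoopWake`, `poll_once` and `pick` are hypothetical names, and the `Unpin` bounds are an assumption made to keep the example free of unsafe pin projection.

```rust
use std::future::{ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Two-variant analogue of the hand-written enum above.
enum FutIn2<F1, F2> {
    Fut1(F1),
    Fut2(F2),
}

impl<F1, F2> Future for FutIn2<F1, F2>
where
    F1: Future + Unpin,
    F2: Future<Output = F1::Output> + Unpin,
{
    type Output = F1::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Delegate to whichever variant is active.
        match self.get_mut() {
            FutIn2::Fut1(f) => Pin::new(f).poll(cx),
            FutIn2::Fut2(f) => Pin::new(f).poll(cx),
        }
    }
}

// Waker that does nothing; enough to poll futures that are ready at once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls a future a single time.
fn poll_once<F: Future + Unpin>(mut fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(&mut fut).poll(&mut cx)
}

// Each branch returns a different concrete future type.
fn pick(flag: bool) -> impl Future<Output = u32> + Unpin {
    if flag {
        FutIn2::Fut1(ready(1u32))
    } else {
        FutIn2::Fut2(Box::pin(async { 2u32 }))
    }
}

fn main() {
    println!("{:?}", poll_once(pick(true)));
    println!("{:?}", poll_once(pick(false)));
}
```

Every extra return point needs another variant and another match arm, which is exactly the growth described above.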

Apr 22 '18 15:04 Ekleog

As I wrote here I think this use case would be better addressed by something modelled after how closures work.

Apr 22 '18 16:04 glaebhoerl

I don’t think it would be wise to make anonymous sum types nominally typed, as you seem to suggest. Structural typing, as with tuples, is far more useful and less surprising to the programmer.

Apr 22 '18 16:04 alexreg

@alexreg What they're saying is that the specific use-case of wanting to return impl Trait with different types in each branch is better handled by a secret nominal type, similar to how closures are implemented.

Therefore, anonymous sum types are separate (and mostly unrelated) from that use case.

Apr 22 '18 16:04 Pauan

@Pauan Oh, well I agree with that. As long as we consider these things two separate features, fine. Thanks for clarifying.

Apr 22 '18 17:04 alexreg

Oh indeed good point, thanks! Just opened #2414 to track this separate feature, as I wasn't able to find any other open issue|RFC for it :)

Apr 23 '18 01:04 Ekleog

I'm planning to get out a pull request for this proposed RFC. Most of you following this thread probably know that a number of proposals like this were rejected for being too complex, so its focus is minimalism and implementation simplicity rather than ergonomics and features. Any words before I get it out? (I've asked this question in multiple other areas to try to collect as much feedback before getting the proposed RFC out, fyi)

https://internals.rust-lang.org/t/pre-rfc-anonymous-variant-types/8707/76

Nov 02 '18 18:11 eaglgenes101

I am not sure where the appropriate place is at this point to suggest solutions to this problem, but one thing that was mentioned was interaction with impl Trait. Perhaps an anonymous enum could be created of all returned things so long as they implement some trait. For instance (the ... are left to your imagination):

fn foo() -> Result<(), impl Error> {
..
return Err(fmt::Error...);
...
return Err(io::Error...);
...
return Ok(());
}

This would make an implicit anonymous enum/sum type that implements Error. This would greatly help the current situation with Rust error handling.
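As a point of comparison, this is roughly the enum one has to write by hand today to get the same effect. It is a sketch: `FooError` and the `step` parameter (used only to pick which failure to simulate) are hypothetical, and the generated type in the proposal would of course be unnamed.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical hand-written version of the enum the compiler would generate.
#[derive(Debug)]
enum FooError {
    Fmt(fmt::Error),
    Io(io::Error),
}

impl fmt::Display for FooError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FooError::Fmt(e) => write!(f, "fmt error: {}", e),
            FooError::Io(e) => write!(f, "io error: {}", e),
        }
    }
}

impl Error for FooError {}

impl From<fmt::Error> for FooError {
    fn from(e: fmt::Error) -> Self {
        FooError::Fmt(e)
    }
}

impl From<io::Error> for FooError {
    fn from(e: io::Error) -> Self {
        FooError::Io(e)
    }
}

// Every `Err` must be converted to the one concrete type behind `impl Error`.
fn foo(step: u8) -> Result<(), impl Error> {
    if step == 1 {
        return Err(FooError::from(fmt::Error));
    }
    if step == 2 {
        return Err(FooError::from(io::Error::new(io::ErrorKind::Other, "disk")));
    }
    Ok(())
}

fn main() {
    for step in 0..3 {
        match foo(step) {
            Ok(()) => println!("step {}: ok", step),
            Err(e) => println!("step {}: {}", step, e),
        }
    }
}
```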

Edit: I can also write up a pre-rfc with this if it seems workable.

Nov 15 '19 23:11 vadixidav

@vadixidav Ideas like that have also been floating around for years under names like enum impl Trait. For example:

https://github.com/rust-lang/rfcs/issues/2414
https://internals.rust-lang.org/t/pre-rfc-sum-enums/8782
https://internals.rust-lang.org/t/extending-impl-trait-to-allow-multiple-return-types/7921.

It's generally considered a separate feature proposal, since an enum impl Trait would be something you cannot match on, so there would be no need for any special syntax for the concrete types or their values, but it would only apply to function returns. An "anonymous sum type" is usually taken to mean something that can be created and used anywhere, would be something you explicitly match on, and thus requires adding some special syntax for the concrete types and values.

Nov 16 '19 00:11 Ixrec

@alexreg Got it. I will direct my focus to places where that feature is being proposed instead. Thank you for the pointer.

Nov 16 '19 00:11 vadixidav

I like this feature, this is like TypeScript union types https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html

It will be interesting to see auto-generated enums in Rust. I already like the TypeScript syntax type1 | type2 | ..., or enum(type1, type2, ...).

fn add_one(mut value: String | i64) -> String | i64 {
   match value {
       x : String => { 
           x.push_str("1"); 
           x 
       }
       y : i64 => { y + 1 }
   }
}
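For contrast, the closest equivalent that compiles today needs a named enum and explicit wrapping on both sides. A sketch, with `StringOrInt` and its variant names made up for illustration:

```rust
// Hypothetical named enum standing in for `String | i64`.
#[derive(Debug, PartialEq)]
enum StringOrInt {
    Str(String),
    Int(i64),
}

fn add_one(value: StringOrInt) -> StringOrInt {
    match value {
        // The binding must be `mut` to append to the string.
        StringOrInt::Str(mut x) => {
            x.push_str("1");
            StringOrInt::Str(x)
        }
        StringOrInt::Int(y) => StringOrInt::Int(y + 1),
    }
}

fn main() {
    println!("{:?}", add_one(StringOrInt::Str(String::from("a"))));
    println!("{:?}", add_one(StringOrInt::Int(41)));
}
```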

Jan 16 '21 17:01 Neo-Ciber94

Any update on this?

May 02 '21 12:05 johannbuscail

Would this also be useful for coalescing errors in Result chains?

trySomething() //Result<A, E1>
.and_then(trySomethingElse) //Result<B, E1|E2>
.and_then(tryYetAnotherThing) //Result<C, E1|E2|E3>
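Without anonymous sum types, a chain like this needs a hand-written error enum and a `map_err` at every step. A sketch of that, where `E1`, `E2`, `E3`, `AnyError`, `run` and the three step functions are all hypothetical placeholders:

```rust
// Hypothetical error types for three fallible steps.
#[derive(Debug, PartialEq)]
struct E1;
#[derive(Debug, PartialEq)]
struct E2;
#[derive(Debug, PartialEq)]
struct E3;

// Hand-written stand-in for the anonymous `E1|E2|E3`.
#[derive(Debug, PartialEq)]
enum AnyError {
    First(E1),
    Second(E2),
    Third(E3),
}

fn try_something(n: i32) -> Result<i32, E1> {
    if n >= 0 { Ok(n) } else { Err(E1) }
}

fn try_something_else(n: i32) -> Result<i32, E2> {
    if n < 100 { Ok(n * 2) } else { Err(E2) }
}

fn try_yet_another_thing(n: i32) -> Result<String, E3> {
    if n != 0 { Ok(n.to_string()) } else { Err(E3) }
}

// Each step's error is wrapped by hand into the shared enum.
fn run(n: i32) -> Result<String, AnyError> {
    try_something(n)
        .map_err(AnyError::First)
        .and_then(|a| try_something_else(a).map_err(AnyError::Second))
        .and_then(|b| try_yet_another_thing(b).map_err(AnyError::Third))
}

fn main() {
    println!("{:?}", run(5));
    println!("{:?}", run(-1));
}
```

Note that here the error type is the full three-way enum from the first step on, whereas the comments in the chain above imagine it widening one step at a time.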

Jul 03 '21 23:07 Luca-spopo

Hey all, I wrote a post about this topic today: https://blog.yoshuawuyts.com/more-enum-types/. In particular I think it's interesting that if we compare structs and enums, it seems enums often take more work to define. Here's the summary table from the post:

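|             | Structs              | Enums                 | Enums Fallback     |
|-------------|----------------------|-----------------------|--------------------|
| Named       | `struct Foo(.., ..)` | `enum Foo { .., .. }` | -                  |
| Anonymous   | `(.., ..)`           | ❌                    | `either` crate     |
| Type-Erased | `impl Trait`         | ❌                    | `auto_enums` crate |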

Feb 15 '22 09:02 yoshuawuyts

auto_enums

I am working on a library to more or less do what you want, I think. It looks something like this:

#[derive(Debug)]
struct Bar;

#[ano_enum]
fn foo() -> ano!(Bar | u8 | u64) {
    Bar
}

#[ano_enum]
fn bar1(foo: ano!(i8 | u8 | Bar)) {
    match ano!(foo) {
        foo::u8(n) => {println!("{}", n + 1)},
        foo::i8(n) => {println!("{}", n)},
        foo::Bar(n) => {println!("{:#?}", n)},
    }
}

Feb 22 '22 12:02 petar-dambovaliev

I like this feature, this is like TypeScript union types https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html

It will be interesting to see auto-generated enums in Rust. I already like the TypeScript syntax type1 | type2 | ..., or enum(type1, type2, ...).
fn add_one(mut value: String | i64) -> String | i64 {
   match value {
       x : String => { 
           x.push_str("1"); 
           x 
       }
       y : i64 => { y + 1 }
   }
}

I really like this syntax since it works much like TypeScript. Rust and TS are my two main languages, and union types are something I greatly miss in Rust. This is probably the #1 feature, in my book, which Rust lacks but needs. I hope this makes it into the language sooner rather than later.

Jul 19 '22 23:07 Keavon

About the comparison with TypeScript union types:

YES! I'm tired of having to guess traits, reading docs, or relying on an IDE, just to say that a fn works correctly for many input-arg types. I wish I could do something like:

const fn gcd(mut a: Int, mut b: Int) -> Int {
    while b != 0 {
        (a, b) = (b, a % b)
    }
    a.abs()
}

Where Int is a named union type comprising all fixed-size integers (signed, unsigned, usize, and isize)

Oct 04 '22 01:10 Rudxain

I suspect most people wouldn't want the enum for that, since they don't want the enum for the return type, but rather they want it to return the type they put in (or maybe the unsigned variant thereof).

Perhaps you're looking for a generic method instead, something like https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=61aa7ed143dbd681b725bc24fbcd7516

use num_traits::*; // 0.2.15
fn gcd<Int: Signed + Copy>(mut a: Int, mut b: Int) -> Int {
    while b != Int::zero() {
        (a, b) = (b, a % b)
    }
    a.abs()
}

Oct 04 '22 02:10 scottmcm

I suspect most people wouldn't want the enum for that, since they don't want the enum for the return type, but rather they want it to return the type they put in (or maybe the unsigned variant thereof).

True. But what I suggest isn't to return an enum per-se, but to return the primitive value directly, regardless of the type (as long as it is constrained).

Perhaps you're looking for a generic method instead, something like https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=61aa7ed143dbd681b725bc24fbcd7516

Thank you a lot! But I wish it was possible to specify types in fn signatures, without any kind of trait-constraints at all, using the types themselves as constraints, like so:

//custom keyword
typedef uint {
    u8, u16, u32, u64, u128, usize
}

const fn gcd(mut a: uint, mut b: uint) -> uint {
    while b != 0 {
        (a, b) = (b, a % b)
    }
    a
}

This way, we could define custom union types that "contain" (I couldn't think of a better term) arbitrary types, as long as the compiler proves that they are "compatible"
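Something close to this can be approximated today with a trait that is implemented for exactly the listed types. A sketch, where the trait `Uint`, its `ZERO` constant and the `impl_uint!` macro are hypothetical names; it is still a trait constraint under the hood, and `gcd` cannot be a `const fn` this way because calling trait methods in const contexts is not stable.

```rust
use std::ops::Rem;

// Hypothetical trait playing the role of the proposed `typedef uint { ... }`.
trait Uint: Copy + PartialEq + Rem<Output = Self> {
    const ZERO: Self;
}

// Implement the trait for exactly the listed types and nothing else.
macro_rules! impl_uint {
    ($($t:ty),*) => {
        $(impl Uint for $t {
            const ZERO: Self = 0;
        })*
    };
}

impl_uint!(u8, u16, u32, u64, u128, usize);

// Euclid's algorithm, generic over the member types of `Uint`.
fn gcd<T: Uint>(mut a: T, mut b: T) -> T {
    while b != T::ZERO {
        (a, b) = (b, a % b);
    }
    a
}

fn main() {
    println!("{}", gcd(12u8, 18u8));
    println!("{}", gcd(48u64, 36u64));
}
```

The trait is not sealed here, so other code in the same crate could add more impls; a private supertrait would be needed to really close the set.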

Oct 04 '22 02:10 Rudxain
