Anonymous sum types
Issue by glaebhoerl
Saturday Aug 03, 2013 at 23:58 GMT
For earlier discussion, see https://github.com/rust-lang/rust/issues/8277
This issue was labelled with: B-RFC in the Rust repository
Rust has an anonymous form of product types (structs), namely tuples, but not sum types (enums). One reason is that it's not obvious what syntax they could use, especially their variants. The first variant of an anonymous sum type with three variants needs to be syntactically distinct not just from the second and third variant of the same type, but also from the first variant of all other anonymous sum types with different numbers of variants.
Here's an idea I think is decent:
A type would look like this: (~str|int|int). In other words, very similar to a tuple, but with pipes instead of commas (signifying or instead of and).
A value would have the same shape (also like tuples), with a value of appropriate type in one of the "slots" and nothing in the rest:
let foo: (~str|int|int) = (!|!|666);
match foo {
(s|!|!) => println(fmt!("string in first: %?", s)),
(!|n|!) => println(fmt!("int in second: %?", n)),
(!|!|m) => println(fmt!("int in third: %?", m))
}
(Nothing is a bikeshed; other possible colors for it include whitespace, . and -. The _ token means something is there that we're just not giving a name, so it's not suitable for "nothing is there". The ! token has nothing-connotations from the negation operator and the return type of functions that don't.)
I'm not sure whether this conflicts syntax-wise with closures and/or negation.
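For comparison, the positional matching described above can be approximated in today's Rust with an ordinary generic enum. This is only a sketch: `Sum3`, its variant names `V0`/`V1`/`V2`, and the helper `describe` are hypothetical names invented here, and `String`/`i64` stand in for the historical `~str`/`int`.

```rust
// Hypothetical stand-in for the proposed `(A|B|C)` type.
// The variants are identified by position, like the "slots" in the proposal.
#[derive(Debug, PartialEq)]
enum Sum3<A, B, C> {
    V0(A),
    V1(B),
    V2(C),
}

// Mirrors the `match foo { ... }` from the proposal, one arm per slot.
fn describe(foo: Sum3<String, i64, i64>) -> String {
    match foo {
        Sum3::V0(s) => format!("string in first: {:?}", s),
        Sum3::V1(n) => format!("int in second: {:?}", n),
        Sum3::V2(m) => format!("int in third: {:?}", m),
    }
}
```

Calling `describe(Sum3::V2(666))` corresponds to the `(!|!|666)` value in the proposal. The difference is that `Sum3` has to be declared once per arity, which is exactly the boilerplate an anonymous form would remove.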
Another necessary condition for this should be demand for it. This ticket is to keep a record of the idea, in case someone else has demand but not syntax. (If the Bikesheds section of the wiki is a better place, I'm happy to move it.)
SEE ALSO
- #402
- #514
- #1154
cc #402, #514, #1154
What's the state of this?
Compared to tuples, anonymous enums would become increasingly tedious to use, since a match statement would have N^2 pipe (|) characters. At the expense of type inference, it may be better to go with a syntax like:
let foo: enum(String, int, int) = enum 2(666);
match foo {
enum 0(s) => println!("string in first: {:?}", s),
enum 1(n) => println!("int in second: {:?}", n),
enum 2(m) => println!("int in third: {:?}", m),
}
The syntax would be compatible with a future extension that allows enums to be declared with named choices:
let foo: enum { Red(String), Green(int), Blue(int) } = enum Blue(666);
match foo {
enum Red(s) => println!("string in first: {:?}", s),
enum Green(n) => println!("int in second: {:?}", n),
enum Blue(m) => println!("int in third: {:?}", m),
}
I think the feature would be more useful without allowing matching, just doing trait dispatch. I guess it's a different feature, where T|T has T's representation, as opposed to one bit more.
@eddyb I've been putting some thought into a feature like that, and I've posted about it on irlo: https://internals.rust-lang.org/t/pre-rfc-anonymous-enum-which-automatically-implement-forwarding-traits/4806
I'd think an Alloca<Trait> analog of Box<Trait> would provide the same functionality as this return enum expr extension of the -> impl Trait idea, except there is dynamic dispatch in Alloca<Trait>, so optimization suffers.
Passing by, but if you are curious about syntaxes, OCaml has anonymous sum types called Polymorphic Variants. Basically they are just a name, like `Blah, which can optionally carry values. An example of the syntax:
# let three = `Int 3;;
val three : [> `Int of int ] = `Int 3
# let four = `Float 4.;;
val four : [> `Float of float ] = `Float 4.
# let nan = `Not_a_number;;
val nan : [> `Not_a_number ] = `Not_a_number
# let list = [three; four; nan];;
val list : [> `Float of float | `Int of int | `Not_a_number ] list
The val lines are the types of the let bindings, left in to show how the typing works.
In the back end, at assembly time, the names are given a globally unique integer. In the current implementation this is done via hashing: a collision is possible, but the chance is extremely low, and warnings can be put in place to catch one. However, I've seen talk of making a global registry so they just get incremented on first access efficiently.
A plain Polymorphic Variant with no data is represented internally as an integer:
`Blah
becomes the integer 737303889 (yes I checked), and comparing those is trivial.
For Polymorphic Variants that can hold data (either a single element or a tuple of elements), such as:
`Blah (42, 6.28)
the value is encoded internally as an array of two fields in assembly: the first is the above number as before, and the second is the pointer to the data of the tuple (although in most cases these all get inlined into the same memory in OCaml due to inlining and optimization passes). In the type system the above would be [> `Blah of int * float ] (in OCaml the types of a tuple are separated by *).
However, the key thing about Polymorphic Variants is that they can be open or closed. Any function can pass along any of them that it wants, including passing them through unhandled. For example, a simple way to handle something like a generic event in OCaml would be:
let f next x = match x with
| `Blah x -> do_something_with x
| `Foobar -> do_something_else ()
| unhandled -> next unhandled
Which is entirely type safe, dependent on what each function handles down the chain and all.
The big thing in the type system is that things can be open or closed typed, i.e. they either accept any number of Polymorphic Variants or a closed set of Polymorphic Variants. If something like the anonymous sum types here were to be accepted, then that concept would be exceedingly useful while being very easy and very fast to statically type.
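To relate the OCaml handler above to Rust as it exists today, here is a rough sketch of the same "handle what you know, forward the rest" shape. It is only an approximation: Rust enums are closed, so the hypothetical `Event` type below has to list every variant up front, which is precisely what open Polymorphic Variants avoid. The names `Event`, `Other`, and the string results are invented for illustration.

```rust
// Hypothetical closed event type. In OCaml the set of variants could stay open.
#[derive(Debug, PartialEq)]
enum Event {
    Blah(i32),
    Foobar,
    Other(String),
}

// Handles `Blah` and `Foobar`, and forwards anything else to `next`,
// like the `| unhandled -> next unhandled` arm in the OCaml example.
fn f(next: impl Fn(Event) -> String, x: Event) -> String {
    match x {
        Event::Blah(n) => format!("blah {}", n),
        Event::Foobar => String::from("foobar"),
        unhandled => next(unhandled),
    }
}
```

Note that `next` here must accept the full `Event` type, including the variants `f` has already handled. With open variants the type checker could instead narrow the type passed down the chain.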
Anonymous sum types might interact with -> impl Trait: at present, this code snippet cannot compile because the iterators have different types:
match x {
A(a) => once(a).chain(foo),
B(b) => once(bar).chain(foo).chain(b),
}
You could make this make sense with an anonymous sum type of the form impl Iterator | impl Iterator, that itself becomes an Iterator, but inferring any type like that sounds like chaos.
One could do it in std with enums like:
enum TwoIterators<A,B> {
IterA(A),
IterB(B),
}
impl<A, B> Iterator for TwoIterators<A, B> where .. { .. }
so the above code becomes
match x {
A(a) => TwoIterators::IterA( once(a).chain(foo) ),
B(b) => TwoIterators::IterB( once(bar).chain(foo).chain(b) ),
}
I could imagine some enum Trait sugar that did basically this too. You cannot delegate associated types or constants to an enum at runtime like this, so an enum Trait must enforce that they all agree across all the variants.
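The elided impl can be filled in along the following lines. This is a sketch rather than anything proposed for std: the `Input` enum and the `numbers` function are hypothetical stand-ins for the `x`, `foo`, and `bar` of the snippet above. The `B: Iterator<Item = A::Item>` bound is the "associated types must agree" requirement made explicit.

```rust
use std::iter::once;

enum TwoIterators<A, B> {
    IterA(A),
    IterB(B),
}

// Both variants must yield the same item type, so `Item` can be delegated.
impl<A, B> Iterator for TwoIterators<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            TwoIterators::IterA(a) => a.next(),
            TwoIterators::IterB(b) => b.next(),
        }
    }
}

// Hypothetical input type standing in for the `x` matched in the thread.
enum Input {
    A(u32),
    B(Vec<u32>),
}

// Each arm builds a different concrete iterator type, yet the function
// has a single return type thanks to the wrapper enum.
fn numbers(x: Input) -> impl Iterator<Item = u32> {
    let foo = vec![10, 20].into_iter();
    match x {
        Input::A(a) => TwoIterators::IterA(once(a).chain(foo)),
        Input::B(b) => TwoIterators::IterB(once(0).chain(foo).chain(b)),
    }
}
```

Dispatch is a plain match on the variant, so no boxing or dynamic dispatch is involved. The cost is one wrapper enum per number of branches.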
This might sound like a weird hack, but how about just making A|B sugar for 'Either'? I suppose it might get even weirder to start composing A|B|C as Either<A,Either<B,C>>, or to have that map to something else. What if there was some sort of general-purpose 'operator overloading' in the type syntax, allowing people to experiment with various possibilities and see what gains traction? (I had yet another suggestion about allowing general-purpose substitutions, e.g. type Either<A,Either<B,C>> = Any3<A,B,C> etc.: https://www.reddit.com/r/rust/comments/6n53oa/type_substitutions_specialization_idea/ Now imagine recovering ~T === Box<T>, ~[T], ... type Box<RawSlice<T>> = Vec<T> ... through a completely general-purpose means.)
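As a rough illustration of the nesting idea, here is what A|B|C looks like when spelled out as Either<A, Either<B, C>> in current Rust. `Either` is defined locally so the example is self-contained, and the alias is written in the valid direction (`type Any3<..> = ..`). The names `Any3` and `show_any3` are hypothetical.

```rust
// Local two-variant sum type. The `either` crate offers a similar shape.
#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

// Hypothetical alias: three alternatives encoded by nesting on the right.
type Any3<A, B, C> = Either<A, Either<B, C>>;

fn show_any3(v: Any3<String, i64, bool>) -> String {
    match v {
        Either::Left(s) => format!("string: {}", s),
        Either::Right(Either::Left(n)) => format!("int: {}", n),
        Either::Right(Either::Right(b)) => format!("bool: {}", b),
    }
}
```

The patterns grow one level deeper with every added alternative, which is the "even weirder" part mentioned above.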
@dobkeratops I'd rather just have a variant style type, i.e., with variadics.
I wrote some code that could potentially fit into a library now that type macros are stable: https://gist.github.com/Sgeo/ecee21895815fb2066e3
Would people be interested in this as a crate?
I've just come upon this issue, while looking for a way to avoid having some gross code that simply doesn't want to go away (actually it's slowly increasing, started at 8 variants and passed by 9 before reaching 12):
use tokio::prelude::*;
pub enum FutIn12<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12>
where
F1: Future<Item = T, Error = E>, // ...
{
Fut1(F1), // ...
}
impl<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12> Future
for FutIn12<T, E, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12>
where
F1: Future<Item = T, Error = E>, // ...
{
type Item = T;
type Error = E;
fn poll(&mut self) -> Result<Async<Self::Item>, Self::Error> {
use FutIn12::*;
match *self {
Fut1(ref mut f) => f.poll(), // ...
}
}
}
I was thus thinking that it'd be great to have anonymous sum types that automatically derive the traits shared by all their variants, so that I could get rid of this code and just have my -> impl Future<Item = (), Error = ()> function return the futures in its various branches (with some syntax that ideally wouldn't expose the total number of variants but would let rustc infer it from the return points, so that adding a branch doesn't require changing all the other return points), and have the anonymous sum type match the -> impl Future return type.
As I wrote here I think this use case would be better addressed by something modelled after how closures work.
I don’t think it would be wise to make anonymous sum types nominally typed, as you seem to suggest. Structural typing, as with tuples, is far more useful and less surprising to the programmer.
@alexreg What they're saying is that the specific use-case of wanting to return impl Trait with different types in each branch is better handled by a secret nominal type, similar to how closures are implemented.
Therefore, anonymous sum types are separate (and mostly unrelated) from that use case.
@Pauan Oh, well I agree with that. As long as we consider these things two separate features, fine. Thanks for clarifying.
Oh indeed good point, thanks! Just opened #2414 to track this separate feature, as I wasn't able to find any other open issue|RFC for it :)
I'm planning to get out a pull request for this proposed RFC. Most of you following this thread probably know that a number of proposals like this were rejected for being too complex, so its focus is minimalism and implementation simplicity rather than ergonomics and features. Any words before I get it out? (I've asked this question in multiple other areas to try to collect as much feedback before getting the proposed RFC out, fyi)
https://internals.rust-lang.org/t/pre-rfc-anonymous-variant-types/8707/76
I am not sure where the appropriate place is at this point to suggest solutions to this problem, but one thing that was mentioned was interaction with impl Trait. Perhaps an anonymous enum could be created of all returned things so long as they implement some trait. For instance (the ... are left to your imagination):
fn foo() -> Result<(), impl Error> {
..
return Err(fmt::Error...);
...
return Err(io::Error...);
...
return Ok(());
}
This would make an implicit anonymous enum/sum type that implements Error. This would greatly help the current situation with Rust error handling.
Edit: I can also write up a pre-rfc with this if it seems workable.
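For reference, this is roughly what has to be written by hand today to get the effect described above, and what the implicit enum would presumably generate. It is a sketch under assumptions: `FooError`, its variants, and the `step` parameter are invented here to make the example concrete.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hand-written sum of the two error types returned by `foo`.
#[derive(Debug)]
enum FooError {
    Fmt(fmt::Error),
    Io(io::Error),
}

impl fmt::Display for FooError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FooError::Fmt(e) => write!(f, "fmt error: {}", e),
            FooError::Io(e) => write!(f, "io error: {}", e),
        }
    }
}

// Forward `source` to whichever variant is present.
impl Error for FooError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            FooError::Fmt(e) => Some(e),
            FooError::Io(e) => Some(e),
        }
    }
}

impl From<fmt::Error> for FooError {
    fn from(e: fmt::Error) -> Self {
        FooError::Fmt(e)
    }
}

impl From<io::Error> for FooError {
    fn from(e: io::Error) -> Self {
        FooError::Io(e)
    }
}

// `step` is a hypothetical switch selecting which return point is hit.
fn foo(step: u8) -> Result<(), impl Error> {
    match step {
        0 => Err(FooError::from(fmt::Error)),
        1 => Err(FooError::from(io::Error::new(io::ErrorKind::Other, "disk"))),
        _ => Ok(()),
    }
}
```

Callers only see `impl Error`, so they cannot match on the variants. That is the distinction between this idea and a matchable anonymous sum type.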
@vadixidav Ideas like that have also been floating around for years under names like enum impl Trait. For example:
- https://github.com/rust-lang/rfcs/issues/2414
- https://internals.rust-lang.org/t/pre-rfc-sum-enums/8782
- https://internals.rust-lang.org/t/extending-impl-trait-to-allow-multiple-return-types/7921.
It's generally considered a separate feature proposal, since an enum impl Trait would be something you cannot match on, so there would be no need for any special syntax for the concrete types or their values, but it would only apply to function returns. An "anonymous sum type" is usually taken to mean something that can be created and used anywhere, would be something you explicitly match on, and thus requires adding some special syntax for the concrete types and values.
@alexreg Got it. I will direct my focus to places where that feature is being proposed instead. Thank you for the pointer.
I like this feature; it is like TypeScript union types: https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html
It will be interesting to see auto-generated enums in Rust. I already like the TypeScript syntax type1 | type2 | ... or enum(type1, type2, ...)
fn add_one(mut value: String | i64) -> String | i64 {
match value {
x : String => {
x.push_str("1");
x
}
y : i64 => { y + 1 }
}
}
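The snippet above is proposed syntax and does not compile today. A minimal sketch of the same function with a named enum is shown below. `StringOrI64` and its variant names are hypothetical and play the role of the auto-generated enum.

```rust
// Hypothetical named equivalent of the proposed `String | i64` type.
#[derive(Debug, PartialEq)]
enum StringOrI64 {
    Str(String),
    Int(i64),
}

fn add_one(value: StringOrI64) -> StringOrI64 {
    match value {
        // Append "1" to strings...
        StringOrI64::Str(mut x) => {
            x.push_str("1");
            StringOrI64::Str(x)
        }
        // ...and add 1 to integers.
        StringOrI64::Int(y) => StringOrI64::Int(y + 1),
    }
}
```

Compared with the proposal, each arm has to re-wrap its result in the right variant, and callers have to wrap their arguments as well.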
Any update on this?
Would this also be useful for coalescing errors in Result chains?
trySomething() // Result<A, E1>
.and_then(trySomethingElse) // Result<B, E1|E2>
.and_then(tryYetAnotherThing) // Result<C, E1|E2|E3>
Hey all, I wrote a post about this topic today: https://blog.yoshuawuyts.com/more-enum-types/. In particular I think it's interesting that if we compare structs and enums, it seems enums often take more work to define. Here's the summary table from the post:
| | Structs | Enums | Enums Fallback |
|---|---|---|---|
| Named | struct Foo(.., ..) | enum Foo { .., .. } | - |
| Anonymous | (.., ..) | ❌ | either crate |
| Type-Erased | impl Trait | ❌ | auto_enums crate |
I am working on a library to more or less do what you want, I think. It looks something like this:
#[derive(Debug)]
struct Bar;
#[ano_enum]
fn foo() -> ano!(Bar | u8 | u64) {
Bar
}
#[ano_enum]
fn bar1(foo: ano!(i8 | u8 | Bar)) {
match ano!(foo) {
foo::u8(n) => {println!("{}", n + 1)},
foo::i8(n) => {println!("{}", n)},
foo::Bar(n) => {println!("{:#?}", n)},
}
}
I like this feature, this is like TypeScript union types https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html
Will be interesting see auto generated enum on rust, I already like the TypeScript syntax type1 | type2 | ... or enum(type1, type2, ...)
fn add_one(mut value: String | i64) -> String | i64 {
match value {
x : String => {
x.push_str("1");
x
}
y : i64 => { y + 1 }
}
}
I really like this syntax since it works much like TypeScript. Rust and TS are my main two languages, and union types are something I greatly miss in Rust. This is probably the #1 feature, in my book, which Rust lacks but needs. I hope this makes it into the language sooner rather than later.
About the comparison with TypeScript union types:
YES! I'm tired of having to guess traits, reading docs, or relying on an IDE, just to say that a fn works correctly for many input-arg types. I wish I could do something like:
const fn gcd(mut a: Int, mut b: Int) -> Int {
while b != 0 {
(a, b) = (b, a % b)
}
a.abs()
}
Where Int is a named union type comprising all fixed-size integers (signed, unsigned, usize, and isize).
I suspect most people wouldn't want the enum for that, since they don't want the enum for the return type, but rather they want it to return the type they put in (or maybe the unsigned variant thereof).
Perhaps you're looking for a generic method instead, something like https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=61aa7ed143dbd681b725bc24fbcd7516
use num_traits::*; // 0.2.15
fn gcd<Int: Signed + Copy>(mut a: Int, mut b: Int) -> Int {
while b != Int::zero() {
(a, b) = (b, a % b)
}
a.abs()
}
I suspect most people wouldn't want the enum for that, since they don't want the enum for the return type, but rather they want it to return the type they put in (or maybe the unsigned variant thereof).
True. But what I suggest isn't to return an enum per se, but to return the primitive value directly, regardless of the type (as long as it is constrained).
Perhaps you're looking for a generic method instead, something like https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=61aa7ed143dbd681b725bc24fbcd7516
Thank you a lot! But I wish it was possible to specify types in fn signatures, without any kind of trait-constraints at all, using the types themselves as constraints, like so:
//custom keyword
typedef uint {
u8, u16, u32, u64, u128, usize
}
const fn gcd(mut a: uint, mut b: uint) -> uint {
while b != 0 {
(a, b) = (b, a % b)
}
a
}
This way, we could define custom union types that "contain" (I couldn't think of a better term) arbitrary types, as long as the compiler proves that they are "compatible"
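The closest thing available today is probably a small local trait implemented for exactly the listed types, which avoids an external crate but still goes through a trait bound. The sketch below assumes that approach. The trait `UInt`, its `ZERO` constant, and the `impl_uint!` macro are hypothetical names, and the function is not `const fn` because trait methods cannot be called in const contexts on stable Rust.

```rust
use std::ops::Rem;

// Hypothetical trait standing in for the proposed `typedef uint { ... }`.
// Only the types listed in `impl_uint!` below satisfy it.
trait UInt: Copy + PartialEq + Rem<Output = Self> {
    const ZERO: Self;
}

// Implements `UInt` for each listed primitive type.
macro_rules! impl_uint {
    ($($t:ty),*) => {
        $(impl UInt for $t { const ZERO: Self = 0; })*
    };
}

impl_uint!(u8, u16, u32, u64, u128, usize);

// Euclid's algorithm. Returns the same type that was passed in.
fn gcd<T: UInt>(mut a: T, mut b: T) -> T {
    while b != T::ZERO {
        (a, b) = (b, a % b);
    }
    a
}
```

For example, `gcd(48u32, 18u32)` evaluates to `6`, while passing an `i32` or an `f64` is rejected at compile time because no `UInt` impl exists for those types.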