An empty unit-type

Open graphitemaster opened this issue 3 years ago • 2 comments

As per our discussion on Discord,

Most programming languages have an empty unit-type. Odin has an empty unit-value, the {} aggregate, which has the special property of being the zero-value of any type. The only problem is that Odin does not have a type that represents this empty unit-value itself.

You can see {} in action below, where it always stands in as the zero-value of whatever type is expected.

x: int = {}; // zero value int
y: f32 = {}; // zero value f32
z: string = {}; // zero value string
w: [2]int = {}; // zero value array (x and y both zero)
q: bool = {}; // zero value bool (i.e. false)
r: rawptr = {}; // zero value pointer (i.e. nil)

This is a powerful tool, as it makes initialization in Odin consistent and trivial. Languages that have an empty unit-value also have the corresponding empty unit-type, so you can use it as a stand-in wherever a type is needed but no value is required. There are two places in Odin where this would be useful: map and proc.

Suppose for a moment there was a type named unit.

Map

As Odin lacks a set type, it's idiomatic to use map[T]bool as a set, with the corresponding if ok expressions for existence checking. The problem is that carrying a bool value for every key is storage prohibitive for large sets. Odin, unlike some other languages, has empty structure types which do have size_of(T) = 0 and align_of(T) = 0, so it is possible to construct a monostate type as just monostate :: struct{} and use it in place of bool, as in map[T]monostate, giving a proper set type. This workaround is a bit weird to explain to beginners, though, and it turns an otherwise very clean language into one with a strange quirk. With the unit type we could instead write a set as map[T]unit, which would be less confusing.
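For reference, a minimal sketch of the struct{} workaround as it can be written today. The names Set and count_unique are hypothetical and chosen only for this illustration; they are not part of the proposal or of Odin's core library.

```odin
package main

import "core:fmt"

// Set is a hypothetical alias used only in this sketch.
Set :: map[string]struct{};

// count_unique inserts every word into a Set and reports how many
// distinct words it saw.
count_unique :: proc(words: []string) -> int {
	seen: Set;
	defer delete(seen);

	for w in words {
		seen[w] = {}; // the value carries no data; only the key matters
	}
	n := len(seen);
	return n;
}

main :: proc() {
	words := []string{"apple", "pear", "apple"};
	fmt.println(count_unique(words));

	// Membership is tested with `in`, so no bool value is needed.
	seen: Set;
	defer delete(seen);
	seen["apple"] = {};
	fmt.println("apple" in seen, "plum" in seen);
}
```

With a unit type, the alias would presumably read map[string]unit and the rest of the code would stay the same.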

Procedures

A very convenient feature in many programming languages is a procedure that returns the empty unit-value. The result of calling such a procedure can be assigned to a variable of any type, standing in for the {} which can already be assigned to anything.

Consider a function which reports an error and returns a zero-value:

error :: proc(message: string) -> unit {
  fmt.printf("%s\n", message);
  return {}; // returns an empty unit-value of the empty unit-type
}
x :: proc() -> bool {
  if !thing do return error("ahhh");
}
y :: proc() -> Maybe(T) {
  if !thing do return error("ahhh");
}
z :: proc() -> rawptr {
  if !thing do return error("ahhh");
}
xx := x();
yy := y();
zz := z();

This is currently not expressible in Odin but should be. In the case of !thing: xx, yy, and zz should all be zero value initialized, as if one had written {} literally, but typed with the return types of the x, y and z procedures.

What should it be named

There are existing names for this feature in other programming languages which we can draw inspiration from:

In Haskell, Rust, and Elm the unit type is called () and its only value is also () (i.e. the 0-tuple representation). We could literally use {} as a 0-aggregate representation, i.e. {} could also be a type, so that unit :: {} is legal, map[T]{} is legal, and x :: proc() -> {} { return {} } is legal.
In the ML languages (OCaml, Standard ML, F#, etc.) the unit type is just called unit, which is the name I chose in this document. The value is written as (), but for our uses we'd write it as {} since that's already the empty unit-value in Odin.
Scala calls it Unit, and uses () as the value, again we'd use {} since that's already the empty unit-value in Odin.
In Lisp the type is named NULL and the value is NIL, not to be confused with the NIL type (this is confusing).
Python has NoneType and the value is None
Swift calls it Void or ()
Java calls it Void and the only value is null
In Go it's struct{}, which is similar to Odin. Go has the struct{}{} idiom (as does Odin), but in Go every struct{} is the same type, so struct{}{} can be assigned wherever a struct{} is expected, whereas Odin treats each struct{} as a distinct type.
Kotlin uses Unit for both the type and the value (confusing).
C++ has std::monostate but this isn't really a unit-type as it cannot convert. C has no equivalent to this at all by the way.

Implicit conversion

The power of the empty unit-type is its ability to implicitly convert to the zero-value of any type. Without this functionality it's not actually a unit-type: the unit-value in Odin already converts to the zero-value of any type, so if the unit-type did not convert, the unit-value would not be of the unit-type. This is a crucial aspect. The unit-type must behave exactly like the literal {} tokens in Odin source. Wherever those are allowed for initialization, a procedure returning the empty unit-type must also be legal.
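To make "wherever {} is allowed" concrete, here is a small sketch of the positions where the literal {} is accepted today: variable initialization, assignment, return, and argument passing. Vec2, origin, and length_squared are hypothetical names used only for this illustration. Under the proposal, a call to a unit-returning procedure would presumably be accepted in each of the same positions.

```odin
package main

import "core:fmt"

// Vec2 is a hypothetical type used only for illustration.
Vec2 :: struct {
	x, y: f32,
};

origin :: proc() -> Vec2 {
	return {}; // return position: {} becomes the zero Vec2
}

length_squared :: proc(v: Vec2) -> f32 {
	return v.x*v.x + v.y*v.y;
}

main :: proc() {
	a: Vec2 = {};            // variable initialization
	b := origin();
	c := length_squared({}); // argument position
	a = {};                  // plain assignment
	fmt.println(a, b, c);
}
```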

Rationale

I've run into the need for this several times in my half a year of writing Odin. I'm working around the lack of this type in other, less efficient ways, for instance by creating monads with compile-time polymorphic procedures, as in:

errorable :: proc($T: typeid) -> proc(string) -> T {
  return proc(message: string) -> T {
    fmt.printf("%s\n", message);
    return {}; // the empty unit-value of T
  };
}

error := errorable(bool);
return error("ahh"); // returns false because bool{}

This certainly "works", but it's non-trivial (bad for beginners) and less efficient (code bloat, slower compile times, etc.). It also doesn't really solve the problem for map, where I still need to define a monostate type such as struct{} and each one ends up being distinct, e.g. error := errorable(struct{}) cannot be assigned as a value to a map[T]struct{} since each empty struct is a distinct type.
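A somewhat flatter variant of the same workaround, shown only as a sketch: the type is passed explicitly at each call site, and a named result that is never assigned supplies the zero-value. The names report_zero and check are hypothetical. This still instantiates one procedure per type and does nothing for the map case, so it narrows the problem rather than solving it.

```odin
package main

import "core:fmt"

// report_zero is a hypothetical helper: it prints the message and returns
// the zero-value of T. The named result is never assigned, so it keeps
// its zero-value.
report_zero :: proc($T: typeid, message: string) -> (result: T) {
	fmt.println(message);
	return;
}

// check shows the call site: the return type has to be repeated by hand,
// which is exactly what a unit type would make unnecessary.
check :: proc(ok: bool) -> bool {
	if !ok do return report_zero(bool, "ahh");
	return true;
}

main :: proc() {
	fmt.println(check(false), check(true));
}
```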

Apr 16 '22 15:04 graphitemaster
