
An empty unit-type

Open graphitemaster opened this issue 3 years ago • 2 comments

As per our discussion on Discord,

Most programming languages have an empty unit-type. Odin has an empty unit-value, the {} aggregate, which has the special property of being the zero-value of any type. The only problem is that Odin does not have a type that represents this empty unit-value itself.

You can see {} in action here, where it stands in as the zero-value for any type:

x: int = {}; // zero value int
y: f32 = {}; // zero value f32
z: string = {}; // zero value string
w: [2]int = {}; // zero value array (both elements zero)
q: bool = {}; // zero value bool (i.e. false)
r: rawptr = {}; // zero value pointer (i.e. nil)

This is a powerful tool, as it makes initialization in Odin consistent and trivial. Languages that have an empty unit-value also have the corresponding empty unit-type, so that it can be used as a stand-in in contexts where a type is needed but no value is required. There are two places in Odin where this would be useful: map and proc.

Suppose for a moment there were a type named unit.

Map

As Odin lacks a set type, it's idiomatic to use map[T]bool as a set, with the corresponding if ok expressions for existence checking. The problem is that carrying a bool value for every key is storage-prohibitive for large sets. Odin, unlike many other languages, has empty structure types which do have size_of(T) = 0 and align_of(T) = 0, so it is possible to construct a monostate type as just monostate :: struct{} and use it in place of bool, as in map[T]monostate, giving a proper set type. This workaround is a bit weird to explain to beginners, though, and it turns an otherwise very clean language into one with a strange quirk. With the unit type, however, we could express a set as map[T]unit, which would be less confusing.
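For reference, the monostate workaround described above might look like the following minimal sketch. The key type (string) and the key names are assumptions chosen for illustration; the sketch only relies on the map[T]monostate pattern and the {} literal already described in this issue.

```odin
package main

import "core:fmt"

// The empty struct from the text above; any empty struct would do.
monostate :: struct{};

main :: proc() {
	// A set of strings, emulated as a map whose values carry no information.
	seen: map[string]monostate;
	defer delete(seen);

	// Inserting a key: the value is always the empty aggregate.
	seen["alpha"] = {};
	seen["beta"]  = {};

	// Membership test with the `in` operator.
	if "alpha" in seen {
		fmt.println("alpha is in the set");
	}

	// The two-value lookup form also works; only `ok` is of interest.
	_, ok := seen["gamma"];
	fmt.println("gamma present:", ok);

	// Print the size of the value type for comparison with bool.
	fmt.println("size_of(monostate):", size_of(monostate), "size_of(bool):", size_of(bool));
}
```

With a unit type as proposed, the declaration would read map[string]unit and the separate monostate definition would no longer be needed.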

Procedures

A very convenient feature in many programming languages is a procedure that returns the empty unit-value, so that the result of calling it can be assigned to a variable of any type, as a stand-in for the {} which can already be assigned to any variable.

Consider a function which reports an error and returns a zero-value:

// Hypothetical: `unit` is the proposed type and does not exist in Odin today.
error :: proc(message: string) -> unit {
  fmt.printf("%s\n", message);
  return {}; // returns the empty unit-value of the empty unit-type
}
// `thing` stands in for any condition; the success paths are omitted.
x :: proc() -> bool {
  if !thing do return error("ahhh");
}
y :: proc() -> Maybe(T) {
  if !thing do return error("ahhh");
}
z :: proc() -> rawptr {
  if !thing do return error("ahhh");
}
xx := x();
yy := y();
zz := z();

This is currently not expressible in Odin but should be. In the !thing case, xx, yy, and zz should all be zero-initialized, as if one had written {} literally, but typed with the return types of the x, y and z procedures.

What should it be named

There are existing names for this feature in other programming languages which we can draw inspiration from:

  • In Haskell, Rust, and Elm the unit type is called () and its only value is also () (i.e. a 0-tuple representation). We could literally use {} as a 0-aggregate representation, i.e. have {} also be a type, so that unit :: {} is legal, map[T]{} is legal, and x :: proc() -> {} { return {} } is legal.
  • In ML languages (OCaml, Standard ML, F#, etc.) the unit type is just called unit, which is what I chose in this document. The value is written as (), but for our uses we'd write it as {} since that's already the empty unit-value in Odin.
  • Scala calls it Unit and uses () as the value; again we'd use {} since that's already the empty unit-value in Odin.
  • In Lisp the type is named NULL and the value is NIL, not to be confused with the NIL type (this is confusing).
  • Python has NoneType and the value is None
  • Swift calls it Void or ()
  • Java calls it Void and the only value is null
  • In Go it's struct{}, which is similar to Odin. Go has the struct{}{} idiom (as does Odin), but in Go every struct{} is the same type, so struct{}{} can be assigned wherever a struct{} is expected, whereas Odin treats each struct{} as a distinct type.
  • Kotlin uses Unit for both the type and the value (confusing).
  • C++ has std::monostate, but this isn't really a unit-type as it cannot convert. C has no equivalent to this at all, by the way.

Implicit conversion

The power of the empty unit-type is its ability to implicitly convert to the zero-value of any type. Without this it is not actually a unit-type: the unit-value in Odin already converts to the zero-value of any type, so if the unit-type did not convert, the unit-value would not be of the unit-type. This is crucial: the unit-type must behave exactly like the literal {} tokens in Odin source. Wherever those are allowed for initialization, a procedure returning the empty unit-type must also be legal.
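To make the target behaviour concrete, here is a small sketch of what the literal {} can already do in return position, which is the behaviour a call returning the unit-type would need to match. The procedure names are hypothetical, and the sketch assumes {} is accepted in a return statement the same way it is in the declarations shown earlier in this issue.

```odin
package main

import "core:fmt"

// Each procedure returns the zero value of its own return type
// through the same `{}` literal.
zero_bool :: proc() -> bool {
	return {};
}

zero_ptr :: proc() -> rawptr {
	return {};
}

zero_maybe :: proc() -> Maybe(int) {
	return {};
}

main :: proc() {
	// The same `{}` token produced three differently typed zero values.
	fmt.println(zero_bool());
	fmt.println(zero_ptr());
	fmt.println(zero_maybe());
}
```

Under the proposal, replacing each return {} with a call to a procedure that returns the unit-type would have to type-check in exactly these positions.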

Rationale

I've run into the need for this several times in my half a year of writing Odin. I'm currently working around the lack of this type in other, less efficient ways, for instance by creating monads with compile-time polymorphic procedures, as in:

errorable :: proc($T: typeid) -> proc(string) -> T {
  return proc(message: string) -> T {
    fmt.printf("%s\n", message);
    return {}; // the empty unit-value of T
  };
}

error := errorable(bool);
return error("ahh"); // returns false because bool{}

This certainly "works", but it's non-trivial (bad for beginners) and less efficient (code bloat, slower compile times, etc.), and it doesn't really solve the problem for map. There I still need to define a monostate type such as struct{}, and each one ends up being distinct: for example, the value returned by error := errorable(struct{}) cannot be assigned as a value in a map[T]struct{}, since each empty struct is a distinct type.

graphitemaster, Apr 16 '22 15:04