Users want to define union or union-like APIs

Based on conversations with @yjbanov, @leonsenft, @mdebbar.

Currently, the Dart language lacks a way to provide static union or union-like semantics or APIs. Other platforms take different approaches: anything from user-definable union types and algebraic/tagged unions to method overloading, and I'm sure other approaches we missed.

Let's look at two examples:

APIs that take nominal types A or B

void writeLogs(Object stringOrListOfString) {
  if (stringOrListOfString is String) {
    _writeLog(stringOrListOfString);
  } else if (stringOrListOfString is List<String>) {
    stringOrListOfString.forEach(_writeLog);
  } else {
    throw ArgumentError.value(stringOrListOfString, 'stringOrListOfString', 'Not a String or List<String>');
  }
}

Problems:

No static type safety. The user can pass an Octopus, and only receive an error at runtime:

void main() {
  // No static error.
  // Runtime error: ArgumentError, since an Octopus is not a String or List<String>.
  writeLogs(Octopus());
}

Relies on complex TFA (type flow analysis) for optimizations, which fall apart with dynamic access:

void main() async {
  // Inferred as "dynamic" for one reason or another.
  var x = something.foo().bar();

  // No static error. Even if it succeeds, all code paths are now retained (disables tree-shaking).
  writeLogs(x);
}

Solutions

A clever user can simply write two functions:

void writeLog(String log) {
   _writeLog(log);
}

void writeLogList(List<String> logs) {
  logs.forEach(_writeLog);
}

... unfortunately, this now means you often need to think of convoluted API names like writeLogList.

Something like user-definable union types:

void writeLog(String | List<String> logOrListOfLogs) {
  if (logOrListOfLogs is String) {
    _writeLog(logOrListOfLogs);
  } else if (logOrListOfLogs is List<String>) {
    logOrListOfLogs.forEach(_writeLog);
  } else {
    // Bonus: Can remove this once we have non-nullable types.
    throw ArgumentError.notNull('logOrListOfLogs');
  }
}

... unfortunately this (a) can't have different return types, (b) might have complex side-effects with reified types (i.e. the performance cost of reifying and storing writeLog<T>(T | List<T> | Map<T, List<T>> | ...)), and (c) just looks ugly compared to the rest of the language.

@yjbanov did mention a first-class match or when could help with (c), but not (a) or (b):

void writeLog(String | List<String> logOrListOfLogs) {
  when (logOrListOfLogs) {
    String: {
      _writeLog(logOrListOfLogs);
    }
    List<String>: {
      logOrListOfLogs.forEach(_writeLog);
    }
    Null: {
      // Bonus: Can remove this once we have non-nullable types.
      throw ArgumentError.notNull('logOrListOfLogs');
    }
  }
}

Something like user-definable method overloads (my preference in this scenario):

void writeLog(String log) {
  _writeLog(log);
}

void writeLog(List<String> logs) {
  logs.forEach(_writeLog);
}

... this solves all of the above concerns. It does not allow dynamic calls, but neither will static extension methods and neither do, say, named constructors or separate methods (used today), so I don't see this as a net negative.

APIs that take structural types A or B

@dantup ran into this while defining Microsoft Language Service protocols. Imagine the following JSON:

// success.json
{
  "status": "SUCCESS"
}

// failure.json
{
  "status": "ERROR",
  "reason": "AUTHENTICATION_REQUIRED"
}

Modeling this in Dart is especially difficult:

void main() async {
  Map<String, Object> response = await doThing();
  final status = response['status'] as String;
  if (status == 'SUCCESS') {
    print('Success!');
  } else if (status == 'ERROR') {
    print('Failed: ${response['reason']}');
  }
}

You can write this by hand, of course, but imagine large auto-generated APIs for popular services. At some point you'll drop down to using code generation, and it's difficult to generate a good, static, model for this.

Problems

Let's imagine we get value types or data classes of some form, and let's even assume NNBD to boot:

data class Response {
  String status;
  String? reason;
}

This works, but like the problems in the nominal types above, you need runtime checks to use the API correctly. This can get very very nasty on giant, popular APIs (like Microsoft's Language Service, but many many others including Google's own):

void main() async {
  var response = await getResponse();
  // Oops; this will never trigger, because we did not capitalize 'ERROR'.
  if (response.status == 'error') {
    print('ERROR!');
    return;
  }
  // Oops; this will print 'Yay: null' because success messages do not have a reason field.
  if (response.status == 'SUCCESS') {
    print('Yay: ${response.reason}');
    return;
  }
}

Solutions

One way this could be solved is having user-definable tagged unions.

TypeScript would model this as:

type Response = IResponseSuccess | IResponseFailure;

interface IResponseSuccess {
  status: "SUCCESS";
}

interface IResponseFailure {
  status: "ERROR";
  reason: string;
}

async function example_1() {
  const response = await getResponse();
  // Static error: "status" must be "SUCCESS" or "ERROR", got "error".
  if (response.status == 'error') {
    console.log('ERROR!');
    return;
  }
}

async function example_2() {
  const response = await getResponse();
  if (response.status == 'ERROR') {
    console.log('ERROR!');
    return;
  }
  // Automatically promotes "response" to "IResponseSuccess"!
  // Static error: "reason" does not exist on "IResponseSuccess".
  console.log('Yay: ', response.reason);
}
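For comparison, here is a minimal Dart sketch of the same two JSON shapes, assuming Dart 3 or later, where sealed classes and switch expressions are available. The names ResponseSuccess, ResponseFailure, and describe are made up for illustration and are not part of any existing API. The string tag is only inspected once, at the parsing boundary, so a typo like 'error' cannot leak into calling code, and reason only exists on the failure type.

```dart
// Hypothetical model of the two JSON shapes above (assumes Dart 3+).
sealed class Response {
  const Response();

  // Inspects the "status" tag once and picks the matching subtype.
  factory Response.fromJson(Map<String, Object?> json) {
    final status = json['status'];
    return switch (status) {
      'SUCCESS' => const ResponseSuccess(),
      'ERROR' => ResponseFailure(json['reason'] as String),
      _ => throw FormatException('Unknown status: $status'),
    };
  }
}

final class ResponseSuccess extends Response {
  const ResponseSuccess();
}

final class ResponseFailure extends Response {
  final String reason;
  const ResponseFailure(this.reason);
}

// The switch must cover every subtype of the sealed class;
// `reason` is only reachable in the failure branch.
String describe(Response response) => switch (response) {
      ResponseSuccess() => 'Success!',
      ResponseFailure(:final reason) => 'Failed: $reason',
    };

void main() {
  print(describe(Response.fromJson({'status': 'SUCCESS'})));
  print(describe(Response.fromJson({
    'status': 'ERROR',
    'reason': 'AUTHENTICATION_REQUIRED',
  })));
}
```

This is nominal rather than structural, so it does not give the literal-type checking that TypeScript does on the status field itself.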

Dec 17 '18 23:12 matanlurey

It does not allow dynamic calls

Is that true? Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions? There will be a dispatch cost, of course, but we're talking about dynamic anyway. I don't think you're worried about method dispatch performance at that point. Without overloads you'd have to do type checks anyway, as your void writeLogs(Object stringOrListOfString) demonstrates.
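A rough sketch of what such a wrapper could look like in today's Dart. Since overloads do not exist, the two typed entry points need distinct names here; writeLogString, writeLogList, writeLogDynamic, and the in-memory _sink are all hypothetical names used only for illustration.

```dart
// Stand-in for the real log destination (an assumption for this sketch).
final List<String> _sink = [];

void _writeLog(String log) => _sink.add(log);

// Two statically typed entry points.
void writeLogString(String log) => _writeLog(log);

void writeLogList(List<String> logs) => logs.forEach(_writeLog);

// Hypothetical wrapper: chooses an entry point from the runtime type,
// which is roughly what a dynamic call to an overloaded name would do.
void writeLogDynamic(Object? value) {
  if (value is String) {
    writeLogString(value);
  } else if (value is List<String>) {
    writeLogList(value);
  } else {
    throw ArgumentError.value(value, 'value', 'Not a String or List<String>');
  }
}

void main() {
  dynamic x = 'one';
  writeLogDynamic(x);
  x = <String>['two', 'three'];
  writeLogDynamic(x);
  print(_sink); // [one, two, three]
}
```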

Dec 18 '18 00:12 yjbanov

Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions?

It could. It does mean though, for overloads at least, you do not know the return type. In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.

Dec 18 '18 01:12 matanlurey

In cases where the compiler can't statically determine which overload to call, it could use a union type for the return:

C foo(A a) {}
D foo(B b) {}

void bar(Object obj) {
  var result = foo(obj); // The compiler would infer the type of `result` as `C | D`.
}

Dec 18 '18 01:12 mdebbar

@matanlurey

It does mean though, for overloads at least, you do not know the return type.

What does it mean to know the return type in dynamic dispatch?

Dec 18 '18 17:12 yjbanov

@matanlurey

In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.

I agree. I also do not see a lot of value in dynamic dispatch as of Dart 2. But that's different from saying that overloads do not support dynamic dispatch. They do. The question is whether we want it.

Dec 18 '18 17:12 yjbanov

@mdebbar Right, that's a second type of dispatch. Unless @matanlurey and I misunderstood each other, we were talking about dispatching d.foo(a) where d is dynamic. What you are talking about is when a is dynamic. Both kinds of dispatches need to be decided upon.

Dec 18 '18 17:12 yjbanov

I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?

Dec 18 '18 17:12 sethladd

https://github.com/Microsoft/TypeScript/blob/master/lib/lib.dom.d.ts is a good source of web platform examples (look for | in that file).

Dec 18 '18 17:12 yjbanov

There are a couple of separate pieces here that I want to try to tease out to understand better. That way we can be more precise about what the actual user need is.

Overloading

This is the ability to have two methods with the same name but different parameter lists. In your example, it's:

void writeLog(String log) {
   _writeLog(log);
}

void writeLog(List<String> logs) {
  logs.forEach(_writeLog);
}

One key question for this is whether overloads should be chosen dynamically or statically. Given:

Object log;
if (isMonday) {
  log = "A string";
} else {
  log = ["Some", "strings"];
}
writeLog(log);

Would you expect this to do the right thing on all days of the week? Or is this a static error because it doesn't know which overload to call at compile-time?

Which answer you choose has profound impact on the design...

Dynamic overloading

If the dispatch does happen at runtime, then you're talking about something like multimethods—runtime dispatch of methods based on the types of their parameters. This is a really cool, powerful feature. It's also very rare in object-oriented languages.

Doing this would let us do things in Dart that few other languages can do, but it could also be fiendishly complex. Consider:

int weird(String s) => 3;
bool weird(List l) => true;

main() {
  var fn = weird;
  Object unknown;
  var o = fn(unknown);
}

What is the static type of fn? What is the static type of o?

Static overloading

This is what C++, Java, C#, etc. do. It's definitely well-explored territory. It solves several real, concrete problems. For example, in Dart, adding a method to a base class may always be a breaking change because some subclass could have a method with the same name but a different signature. In the listed languages, that's much safer. If the signature is compatible, there's no problem. If it isn't, it just becomes a separate overload. The only risk is if there's a compatible signature but an incompatible return type.

Static overloading also has a deserved reputation for adding a ton of complexity to the language. It complicates generics and implicit conversions, sometimes leads to exponential performance cliffs during type-checking, and confuses users.

Union types

This is the ability to define a structural type that permits all values of any two given types. That's:

void writeLog(String | List<String> logOrListOfLogs) {
  if (logOrListOfLogs is String) {
    _writeLog(logOrListOfLogs);
  } else if (logOrListOfLogs is List<String>) {
    logOrListOfLogs.forEach(_writeLog);
  } else {
    // Bonus: Can remove this once we have non-nullable types.
    throw ArgumentError.notNull('logOrListOfLogs');
  }
}

Dart has already taken steps in this direction with FutureOr<T> and will take more steps with non-nullable types. The plan is that a nullable type is effectively sugar for the union of the underlying type and Null. So int? means int | Null. The semantics fall out of that.
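As a small illustration of those two built-in union-like types, here is a sketch assuming a null-safe Dart SDK. The function names doubled and orZero are invented for the example.

```dart
import 'dart:async';

// FutureOr<int> accepts either an int or a Future<int>.
Future<int> doubled(FutureOr<int> value) async {
  // Awaiting a non-Future value simply yields the value itself.
  final resolved = await value;
  return resolved * 2;
}

// int? behaves like the union of int and Null.
int orZero(int? value) => value ?? 0;

Future<void> main() async {
  print(await doubled(21)); // 42
  print(await doubled(Future.value(4))); // 8
  print(orZero(null)); // 0
}
```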

I wouldn't be surprised if we eventually get union types, though we don't have plans for it currently. (Non-nullable types will keep us more than busy enough for the immediate future.) Union types are nice, but don't solve as many problems as users think.

Consider +. You'd expect its declaration in the int class to look something like:

class int {
  int | double operator +(int | double rhs) => ...
}

But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double. You really want to say "if the parameter type is int, then the return type is int. If the parameter type is double, then the return type is double."

Overloading can express that, but union types can't.

Literal types

The TypeScript example introduces an entirely new feature, singleton types that only contain a single value. That lets you use an == on a property value to determine the type of some surrounding object. It looks to me like a hint of dependent typing.

That's a lot of type system machinery to add, and I'm not sure how useful it is. It quickly falls down if you don't compare to actual literal values. It might be worth looking at, but I'd be surprised if it fit well within a more nominal language like Dart.

Dec 18 '18 18:12 munificent

Thanks @munificent. I agree this is probably a few issues and needs more investigation.

Without a longer reply, my 2 cents:

Dynamic overloading is cool, but not necessary. With potentially implicit downcasts being disabled by default (or going away entirely), you'd have to cast with as in order to even invoke the multi-methods, which in turn means that you might as well just have static overloading only.
Literal types (i.e. tagged unions, @yjbanov will want to say more, I'm sure) are cool. I agree maybe they are a "lot" to add now to (mostly nominal) Dart, but they could potentially add a lot of value in our serialization story (JSON, ProtoBufs, etc).

Dec 18 '18 18:12 matanlurey

#148 introduces 'case functions', which is one way to handle the issues described here. Considering some points raised above:

@matanlurey wrote:

It does not allow dynamic calls

Case functions do allow that.

@munificent wrote:

you're talking about something like multimethods

Right, case functions rely on a simple, user-specified approach to disambiguation (so you won't ever get "ambiguous invocation" errors, which is otherwise a source of a long list of fine papers ;-).

It is guaranteed in some (but not all) cases that the semantics of a case function invocation is exactly the same for a statically resolved case and for a dynamically resolved case, and I expect that this could be subject to 'strict' warnings. For instance, sealed classes would give some useful guarantees.

So you could say that case functions are a pragmatic take on multimethods.

But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double.

When giving an argument of type int to a case function whose corresponding case has return type int, we would get the type int for the returned result. If the case is chosen dynamically then we may know less.

Literal Types.

It is probably not too hard to introduce constants as patterns for case functions. But we might want to design a general pattern declaration and matching feature first, such that we can use the same approach everywhere.

Dec 18 '18 18:12 eernstg

I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?

Web APIs are littered with these.

Search for "or" on this page: https://firebase.google.com/docs/reference/js/firebase.firestore.Query
Search for DartName= in https://github.com/dart-lang/sdk/blob/master/tools/dom/idl/dart/dart.idl

WebIDL explicitly supports Union types - https://www.w3.org/TR/WebIDL-1/#idl-union - so this is always an issue when interfacing with web/JS apis

CC @sethladd

Dec 19 '18 00:12 kevmoo

Another random data point: I've been using C# again recently (for a silly hobby project), and wow, having static overloads is so nice. I'd forgotten how nice they are. Really helps with API design, too.

Dec 21 '18 02:12 jmesserly

Would syntax sugar over callable classes work?

Right now we can achieve the equivalent of named constructors for functions using callable classes:

const someFunction = _SomeFunction();

class _SomeFunction {
  const _SomeFunction();

  void call() {}
  int customName() => 42;
}

which allows

someFunction();
int result = someFunction.customName();

It works but is not very convenient to write.

Allowing . in the names of functions may be a good idea:

void someFunction() {}
int someFunction.customName() => 42;

The bonus point here is that since it's not actual method overload, dynamic invocation still works just fine.

Jan 14 '19 00:01 rrousselGit

This issue seems related to #83. Is there a canonical issue for discussing union / sum types? Can we close the others as duplicates?

Adding my 2 cents here

Approach A: Common Superclass

Imagine I want to create a tree data structure that has 2 types of nodes: a ChildNode contains a value, and an InternalNode contains some list of child nodes. I could accomplish this using a superclass like this:

abstract class Node {}

class ChildNode extends Node {
    final String value;

    ChildNode(this.value);
}

class InternalNode extends Node {
    final List<Node> children;

    InternalNode(this.children);
}

This works, because I control the definition of both classes, and so I can have them inherit from a common superclass. To make use of a Node, I would have to handle each case of child class as such:

String nodeToString(Node node) {
    if (node is ChildNode) {
        return (node as ChildNode).value;
    }
    assert(node is InternalNode, "Unexpected Node type. Expected a ChildNode or an InternalNode");
    var buffer = StringBuffer();
    for (var child in (node as InternalNode).children) {
        buffer.write(nodeToString(child));
    }
    return buffer.toString();
}

There are 3 things to note here:

1. The dart type system didn't help me to narrow the type of node within the branches of the program. For example, despite being inside of an if block with node is ChildNode, I still had to cast node as ChildNode before I could access the value property. Similarly, the code after the assertion didn't narrow the type of node to InternalNode.
2. The dart type system didn't help me to check for the exhaustiveness of the union (ex. what if I didn't handle InternalNode).
3. Along the lines of 1 and 2 above, if node is ChildNode is false, there's no way for the dart type system to narrow the type of node to InternalNode, because another module could define a different subclass of Node that would be valid here too.

Despite the disadvantages (1-3) above, this approach can work when the child types are written by the developer themselves. However, what about something like an id value that can be a String or an int?

Approach B: Union Wrapper

In the case of a union on types that are not "owned" by the code author, a wrapper class can be created. For example, in the case of a String or int:

class StringOrInt {
    final String? _string;
    final int? _int;

    StringOrInt.fromString(String this._string) : _int = null;
    StringOrInt.fromInt(int this._int) : _string = null;

    bool get isString => _string != null;
    bool get isInt => _int != null;

    String asString() {
        assert(isString);
        return _string!;
    }

    int asInt() {
        assert(isInt);
        return _int!;
    }
}

One thing to note here is that, in the asFoo methods, dart doesn't infer the type of _foo to be non-null, despite the assert. Either way, here is how the class can be used:

class Foo {
    final String name;
    final int id;

    Foo(this.name, this.id);
}

List<Foo> foos = [];

Foo? findFooByName(String name) {
    // omitted
}

Foo? findFooById(int id) {
    // omitted
}

Foo? findFoo(StringOrInt query) {
    if (query.isString) {
        return findFooByName(query.asString());
    }
    return findFooById(query.asInt());
}

This is definitely awkward to write and use, but it has some advantages:

It works for native types like String or int.
Asserts are implicit. query.asInt() has an assertion that will throw if query doesn't represent an int type. If query.isString returns false, we know it won't throw.
Type casts aren't necessary (they're implicit). When we call query.asString, we're guaranteed a String, not a String?.

Extensions

We can extend this further by adding methods to easily convert from the base type:

extension StringUnion on String {
    StringOrInt toUnion() => StringOrInt.fromString(this);
}

extension IntUnion on int {
    StringOrInt toUnion() => StringOrInt.fromInt(this);
}

Generics

Alternatively, we can generalize this as follows:

class Union<T, U> {
  final T? _left;
  final U? _right;

  Union.left(this._left) : _right = null;
  Union.right(this._right) : _left = null;

  bool get isLeft => _left != null;
  bool get isRight => _right != null;

  T get left {
    assert(isLeft);
    return _left!;
  }

  U get right {
    assert(isRight);
    return _right!;
  }

  dynamic get deref {
    return isLeft ? left : right;
  }
}

This has the advantage of not needing to create type-specific unions for each combination that needs one. The disadvantage is that the naming is very generic. Similarly, you could create a sort of extension method with this as follows:

extension ToUnion on String {
    Union<String, T> toUnion<T>() => Union.left(this);
}

You could use this as follows:

"some string".toUnion();

The problem with this is that it doesn't scale:

extension IntToUnion on int {
    Union<int, T> toUnion<T>() => Union.left(this);
}

Union<String, int> query = "name".toUnion<int>(); // this works
query = 42.toUnion<String>(); // this doesn't work. Union<int, String> != Union<String, int>

You could try to fix this using some sort of flip operation:

class Union<T, U> {
    // as before

    Union<U, T> flip() => isLeft ? Union<U, T>.right(left) : Union<U, T>.left(right);
}

The resulting code would work like this:

Union<String, int> query = "name".toUnion<int>();
query = 42.toUnion<String>().flip();

This is quite awkward, and the use of it is even more so: query.left instead of query.asString(), etc.
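One way to soften the usage side, sketched here as an assumption rather than a recommendation, is to give the wrapper a fold method that takes one callback per side. This variant also stores an explicit isLeft flag instead of relying on null checks; fold and describe are hypothetical names.

```dart
// Hypothetical variant of the Union<T, U> wrapper above.
class Union<T, U> {
  final T? _left;
  final U? _right;
  final bool isLeft;

  Union.left(T value)
      : _left = value,
        _right = null,
        isLeft = true;

  Union.right(U value)
      : _left = null,
        _right = value,
        isLeft = false;

  // Calls exactly one callback, depending on which side is populated.
  R fold<R>(R Function(T) onLeft, R Function(U) onRight) =>
      isLeft ? onLeft(_left as T) : onRight(_right as U);
}

String describe(Union<String, int> query) =>
    query.fold((name) => 'name=$name', (id) => 'id=$id');

void main() {
  print(describe(Union.left('abc'))); // name=abc
  print(describe(Union.right(42))); // id=42
}
```

Callers never touch left or right directly, but the ordering problem with Union<String, int> versus Union<int, String> remains.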

Approach C: Full-blown `dynamic`

Another approach that seems obvious, but should not be glossed over, is just using dynamic:

dynamic query = 2;
query = "abc"; // valid

if (query is int) {
  // do int stuff
} else {
  assert(query is String);
  // do String stuff
}

This suffers from the same drawbacks of approach A with regard to type narrowing, but it can be used on native types like String or int.

Another drawback here is that the type of the variable ends up having to be documented in plain text (ex. "Only call this with a String or an int") which is error-prone at best and also a poor developer experience.

Potential Language Improvements

Improvement 1: Allow mixin extensions

mixin Query {}

extension Queryable on String with Query {}

extension Queryable on int with Query {}

This solves the problem of the superclass approach for native types or types outside of the developer's control. It still has problems with narrowing.

Improvement 2: Improve type narrowing in dart

In the case above, this code:

if (node is ChildNode) {
    return (node as ChildNode).value;
}

Should be valid as this:

if (node is ChildNode) {
    return node.value;
}

And similarly, for this:

bool get isLeft => _left != null;

T get left {
    assert(isLeft);
    return _left!;
}

We should be able to write:

bool get isLeft => _left != null;

T get left {
    assert(isLeft);
    return _left; // can't be null, because of the `assert`
}

Improvement 3: Full support for union types

Supporting a type such as this:

String | int query = "hello";
query = 2;
query = 3.14; // error: `double` can't be assigned to `String | int`.

Open questions

What would happen if union types are used as a generic parameter? For example:

List<String | int> queries = [];

// or this:
var queries = <String | int>[1, 3, 4];

Could this be handled in the same way as nullable types? For example:

List<int?> maybeInts = [];

Alternatively, should nullability be treated as a special case of a union type?

What about the Type object for a union type? Should the dart VM provide some sort of runtime information specific to a union? Should the Type for a union encompass some sort of List<Type>? If so, what about typedefs?

typedef Query = String | int;
typedef QueryBuilder = Query Function();
typedef QueryInput = Query | QueryBuilder;

What would the members of QueryInput's Type be? [String, int, QueryBuilder]? Or [[String, int], QueryBuilder]? Would multiple unions be allowed? Ex. (A | B | C)? Would they be represented as a union of unions? Or a single union? If the former, what is the associativity of the union operator?

I'm open to discussing this further and contributing to speccing something out.

Jul 05 '22 16:07 TzviPM

Bonus idea: Could union types be used as an implicit interface for a sealed type as per https://github.com/dart-lang/language/blob/master/working/0546-patterns/exhaustiveness.md#sealed-types ? If so, we could piggy-back exhaustiveness checking for sealed types off of type narrowing for union types.

Jul 05 '22 16:07 TzviPM

@TzviPM Your "Improvement 2" already works. You may try the following code, which calls members that are only available after type promotion (toUpperCase from String, isOdd from int).

import 'dart:math';

void main() {
  final value = Random().nextBool() ? "I'm a String" : 10;

  if (value is String) {
    print(value.toUpperCase());
  } else if (value is int) {
    print('Is odd? ${value.isOdd}');
  }
}

Regarding the assert, it doesn't work because of its dynamic nature, but in other cases it will promote correctly, for instance:

Left useLeftAsNonNullable() {
  final left = getLeft(); // Nullable
  final isLeft = left != null;

  if (!isLeft) throw StateError('Left should not be null here!');
  return left;
}

Jul 05 '22 20:07 mateusfccp

@TzviPM wrote:

The dart type system didn't help me to narrow the type of node within the branches of the program

[Edit: Checking again, node wasn't an instance variable, so the comment I wrote about that was not relevant.]

But promotion works just fine with a current version of dart, and some small adjustments of the code:

abstract class Node {}

class ChildNode extends Node {
  final String value;
  ChildNode(this.value);
}

class InternalNode extends Node {
  final List<Node> children;
  InternalNode(this.children);
}

String nodeToString(Node node) {
  if (node is ChildNode) {
    return node.value;
  } else if (node is InternalNode) {
    var buffer = StringBuffer();
    for (var child in node.children) {
      buffer.write(nodeToString(child));
    }
    return buffer.toString();
  } else {
    throw "Unexpected subtype of Node";    
  }
}

void main() {
  print(nodeToString(ChildNode('A node')));
}

The dart type system didn't help me to check for the exhaustiveness of the union

Dart is quite likely to introduce a notion of 'sealed' or 'switch' classes, and this concept is specifically aimed at enabling exhaustiveness checks. So there is no solution right now, but that is likely to change.
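For reference, a minimal sketch of how the same Node hierarchy could look with sealed classes and a switch expression, assuming Dart 3 or later. If a new direct subtype of Node were added to the same library, the switch below would stop compiling until that case is handled.

```dart
// Assumes Dart 3+: `sealed` restricts direct subtypes to this library.
sealed class Node {}

final class ChildNode extends Node {
  final String value;
  ChildNode(this.value);
}

final class InternalNode extends Node {
  final List<Node> children;
  InternalNode(this.children);
}

// No default branch is needed; the compiler checks that
// every subtype of Node is covered.
String nodeToString(Node node) => switch (node) {
      ChildNode(:final value) => value,
      InternalNode(:final children) => children.map(nodeToString).join(),
    };

void main() {
  final tree = InternalNode([
    ChildNode('a'),
    InternalNode([ChildNode('b')]),
  ]);
  print(nodeToString(tree)); // ab
}
```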

Jul 05 '22 20:07 eernstg

You might be interested in a new package, https://pub.dev/packages/extension_type_unions. More details in this comment.

Feb 20 '24 15:02 eernstg
