Users want to define union or union-like APIs
Based on conversations with @yjbanov, @leonsenft, @mdebbar.
Currently, the Dart language lacks a way to provide static union or union-like semantics or APIs. Other platforms take different approaches, ranging from user-definable union types and algebraic/tagged unions to method overloading, and I'm sure other approaches we missed.
Let's look at two examples:
APIs that take nominal types A or B
void writeLogs(Object stringOrListOfString) {
if (stringOrListOfString is String) {
_writeLog(stringOrListOfString);
} else if (stringOrListOfString is List<String>) {
stringOrListOfString.forEach(_writeLog);
} else {
throw ArgumentError.value(stringOrListOfString, 'Not a String or List<String>');
}
}
Problems:
- No static type safety. The user can pass an Octopus, and only receive an error at runtime:
void main() {
// No static error.
// Runtime error: "Instance of 'Octopus': Not a String or List<String>".
writeLogs(Octopus());
}
- Relies on complex type flow analysis (TFA) for optimizations, which falls apart with dynamic access:
void main() async {
// Inferred as "dynamic" for one reason or another.
var x = something.foo().bar();
// No static error. Even if it succeeds, all code paths are now retained (disables tree-shaking).
writeLogs(x);
}
Solutions
- A clever user can simply write two functions:
void writeLog(String log) {
_writeLog(log);
}
void writeLogList(List<String> logs) {
logs.forEach(_writeLog);
}
... unfortunately, this now means you often need to think of convoluted API names like writeLogList.
- Something like user-definable union types:
void writeLog(String | List<String> logOrListOfLogs) {
  if (logOrListOfLogs is String) {
    _writeLog(logOrListOfLogs);
  } else if (logOrListOfLogs is List<String>) {
    logOrListOfLogs.forEach(_writeLog);
  } else {
    // Bonus: Can remove this once we have non-nullable types.
    throw ArgumentError.notNull('logOrListOfLogs');
  }
}
... unfortunately this (a) can't have different return types, (b) might have complex side-effects with reified types (i.e. expensive performance reifying and storing writeLog<T>(T | List<T> | Map<T, List<T>> | ....)), and (c) just looks ugly compared to the rest of the language.
@yjbanov did mention a first-class match or when could help with (c), but not (a) or (b):
void writeLog(String | List<String> logOrListOfLogs) {
when (logOrListOfLogs) {
String: {
_writeLog(logOrListOfLogs);
}
List<String>: {
logOrListOfLogs.forEach(_writeLog);
}
Null: {
// Bonus: Can remove this once we have non-nullable types.
throw ArgumentError.notNull('logOrListOfLogs');
}
}
}
- Something like user-definable method overloads (my preference in this scenario):
void writeLog(String log) {
_writeLog(log);
}
void writeLog(List<String> logs) {
logs.forEach(_writeLog);
}
... this solves all of the above concerns. It does not allow dynamic calls, but neither will static extension methods and neither do, say, named constructors or separate methods (used today), so I don't see this as a net negative.
APIs that take structural types A or B
@dantup ran into this while defining Microsoft Language Service protocols. Imagine the following JSON:
// success.json
{
"status": "SUCCESS"
}
// failure.json
{
"status": "ERROR",
"reason": "AUTHENTICATION_REQUIRED"
}
Modeling this in Dart is especially difficult:
void main() async {
Map<String, Object> response = await doThing();
final status = response['status'] as String;
if (status == 'SUCCESS') {
print('Success!');
} else if (status == 'ERROR') {
print('Failed: ${response['reason']}');
}
}
You can write this by hand, of course, but imagine large auto-generated APIs for popular services. At some point you'll drop down to using code generation, and it's difficult to generate a good, static, model for this.
Problems
Let's imagine we get value types or data classes of some form, and let's even assume NNBD to boot:
data class Response {
String status;
String? reason;
}
This works, but like the problems in the nominal types above, you need runtime checks to use the API correctly. This can get very very nasty on giant, popular APIs (like Microsoft's Language Service, but many many others including Google's own):
void main() async {
var response = await getResponse();
// Oops; this will never trigger, because we did not capitalize 'ERROR'.
if (response.status == 'error') {
print('ERROR!');
return;
}
// Oops; this will print 'Yay: null' because success messages do not have a reason field.
if (response.status == 'SUCCESS') {
print('Yay: ${response.reason}');
return;
}
}
Solutions
One way this could be solved is having user-definable tagged unions.
TypeScript would model this as:
type Response = IResponseSuccess | IResponseFailure;
interface IResponseSuccess {
status: "SUCCESS";
}
interface IResponseFailure {
status: "ERROR";
reason: string;
}
async function example_1() {
const response = await getResponse();
// Static error: "status" must be "SUCCESS" or "ERROR", got "error".
if (response.status == 'error') {
console.log('ERROR!');
return;
}
}
async function example_2() {
const response = await getResponse();
if (response.status == 'ERROR') {
console.log('ERROR!');
return;
}
// Automatically promotes "response" to "IResponseSuccess"!
// Static error: "reason" does not exist on "IResponseSuccess".
console.log('Yay: ', response.reason);
}
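For comparison with the TypeScript example above, here is a minimal sketch of how the same two JSON shapes could be modeled with sealed classes, a feature that only arrived in Dart 3 and so postdates this discussion. It uses nominal subclasses rather than literal types, and the names ApiResponse, Success, Failure, and describe are hypothetical, chosen only for illustration.

```dart
// Hypothetical model of the two JSON shapes above; requires Dart 3 or later.
sealed class ApiResponse {
  const ApiResponse();

  // Assumption: the "status" field is the tag that selects the variant.
  factory ApiResponse.fromJson(Map<String, Object?> json) {
    return switch (json['status']) {
      'SUCCESS' => const Success(),
      'ERROR' => Failure(json['reason'] as String),
      final other => throw FormatException('Unknown status: $other'),
    };
  }
}

final class Success extends ApiResponse {
  const Success();
}

final class Failure extends ApiResponse {
  const Failure(this.reason);
  final String reason;
}

// The switch must cover every subtype of the sealed class to compile.
String describe(ApiResponse response) => switch (response) {
      Success() => 'Success!',
      // `reason` is only reachable once the value is known to be a Failure.
      Failure(:final reason) => 'Failed: $reason',
    };
```

With this shape, a misspelled status is rejected when the JSON is decoded rather than silently ignored later, and reading reason from a Success value is a static error because the field does not exist on that class.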
It does not allow dynamic calls
Is that true? Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions? There will be a dispatch cost, of course, but we're talking about dynamic anyway. I don't think you're worried about method dispatch performance at that point. Without overloads you'd have to do type checks anyway, as your void writeLogs(Object stringOrListOfString) demonstrates.
Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions?
It could. It does mean though, for overloads at least, you do not know the return type. In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.
In cases where the compiler can't statically determine which overload to call, it could use a union type for the return:
C foo(A a) {}
D foo(B b) {}
void bar(Object obj) {
var result = foo(obj); // The compiler would infer the type of `result` as `C | D`.
}
@matanlurey
It does mean though, for overloads at least, you do not know the return type.
What does it mean to know the return type in dynamic dispatch?
@matanlurey
In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.
I agree. I also do not see a lot of value in dynamic dispatch as of Dart 2. But that's different from saying that overloads do not support dynamic dispatch. They do. The question is whether we want it.
@mdebbar Right, that's a second type of dispatch. Unless @matanlurey and I misunderstood each other, we were talking about dispatching d.foo(a) where d is dynamic. What you are talking about is when a is dynamic. Both kinds of dispatches need to be decided upon.
I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?
https://github.com/Microsoft/TypeScript/blob/master/lib/lib.dom.d.ts is a good source of web platform examples (look for | in that file).
There are a couple of separate pieces here that I want to try to tease out to understand better. That way we can be more precise about what the actual user need is.
Overloading
This is the ability to have two methods with the same name but different parameter lists. In your example, it's:
void writeLog(String log) {
_writeLog(log);
}
void writeLog(List<String> logs) {
logs.forEach(_writeLog);
}
One key question for this is whether overloads should be chosen dynamically or statically. Given:
Object log;
if (isMonday) {
log = "A string";
} else {
log = ["Some", "strings"];
}
writeLog(log);
Would you expect this to do the right thing on all days of the week? Or is this a static error because it doesn't know which overload to call at compile-time?
Which answer you choose has profound impact on the design...
Dynamic overloading
If the dispatch does happen at runtime, then you're talking about something like multimethods—runtime dispatch of methods based on the types of their parameters. This is a really cool, powerful feature. It's also very rare in object-oriented languages.
Doing this would let us do things in Dart that few other languages can do, but it could also be fiendishly complex. Consider:
int weird(String s) => 3;
bool weird(List l) => true;
main() {
var fn = weird;
Object unknown;
var o = fn(unknown);
}
What is the static type of fn? What is the static type of o?
Static overloading
This is what C++, Java, C#, etc. do. It's definitely well-explored territory. It solves several real, concrete problems. For example, in Dart, adding a method to a base class may always be a breaking change because some subclass could have a method with the same name but a different signature. In the listed languages, that's much safer. If the signature is compatible, there's no problem. If it isn't, it just becomes a separate overload. The only risk is if there's a compatible signature but an incompatible return type.
Static overloading also has a deserved reputation for adding a ton of complexity to the language. It complicates generics and implicit conversions, sometimes leads to exponential performance cliffs during type-checking, and confuses users.
Union types
This is the ability to define a structural type that permits all values of any two given types. That's:
void writeLog(String | List<String> logOrListOfLogs) {
if (stringOrListOfString is String) {
_writeLog(stringOrListOfString);
} else if (stringOrListOfString is List<String>) {
stringOrListOfString.forEach(_writeLog);
} else {
// Bonus: Can remove this once we have non-nullable types.
throw ArgumentError.null(logOrListOfLogs);
}
}
Dart has already taken steps in this direction with FutureOr<T> and will take more steps with non-nullable types. The plan is that a nullable type is effectively sugar for the union of the underlying type and Null. So int? means int | Null. The semantics fall out of that.
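As a rough illustration of the two built-in union-like types mentioned here, the sketch below shows a FutureOr<int> parameter and an int? parameter. The function names doubled, kindOf, and orZero are hypothetical and exist only for this example.

```dart
import 'dart:async';

// Hypothetical helper: accepts either an int or a Future<int>.
Future<int> doubled(FutureOr<int> value) async {
  // Awaiting a non-Future value simply yields the value itself.
  final resolved = await value;
  return resolved * 2;
}

// A type test distinguishes the two branches of the FutureOr at run time.
String kindOf(FutureOr<int> value) =>
    value is Future<int> ? 'a Future<int>' : 'an int';

// `int?` admits every `int` plus `null`, much like `int | Null` would.
int orZero(int? maybe) => maybe ?? 0;
```

In both cases the set of alternatives is fixed by the language, which is why these types do not require the general machinery that user-definable unions would.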
I wouldn't be surprised if we eventually get union types, though we don't have plans for it currently. (Non-nullable types will keep us more than busy enough for the immediate future.) Union types are nice, but don't solve as many problems as users think.
Consider +. You'd expect its declaration in the int class to look something like:
class int {
int | double operator +(int | double rhs) => ...
}
But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double. You really want to say "if the parameter type is int, then the return type is int. If the parameter type is double, then the return type is double."
Overloading can express that, but union types can't.
Literal types
The TypeScript example introduces an entirely new feature, singleton types that only contain a single value. That lets you use an == on a property value to determine the type of some surrounding object. It looks to me like a hint of dependent typing.
That's a lot of type system machinery to add, and I'm not sure how useful it is. It quickly falls down if you don't compare to actual literal values. It might be worth looking at, but I'd be surprised if it fit well within a more nominal language like Dart.
Thanks @munificent. I agree this is probably a few issues and needs more investigation.
Without a longer reply, my 2 cents:
- Dynamic overloading is cool, but not necessary. With potentially implicit downcasts being disabled by default (or going away entirely), you'd have to cast with as in order to even invoke the multi-methods, which in turn means that you might as well just have static overloading only.
- Literal types (i.e. tagged unions, @yjbanov will want to say more, I'm sure) are cool. I agree maybe they are a "lot" to add now to (mostly nominal) Dart, but they could potentially add a lot of value in our serialization story (JSON, ProtoBufs, etc).
#148 introduces 'case functions', which is one way to handle the issues described here. Considering some points raised above:
@matanlurey wrote:
It does not allow dynamic calls
Case functions do allow that.
@munificent wrote:
you're talking about something like multimethods
Right, case functions rely on a simple, user-specified approach to disambiguation (so you won't ever get "ambiguous invocation" errors, which is otherwise a source of a long list of fine papers ;-).
It is guaranteed in some (but not all) cases that the semantics of a case function invocation is exactly the same for a statically resolved case and for a dynamically resolved case, and I expect that this could be subject to 'strict' warnings. For instance, sealed classes would give some useful guarantees.
So you could say that case functions are a pragmatic take on multimethods.
But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double.
When giving an argument of type int to a case function whose corresponding case has return type int, we would get the type int for the returned result. If the case is chosen dynamically then we may know less.
Literal Types.
It is probably not too hard to introduce constants as patterns for case functions. But we might want to design a general pattern declaration and matching feature first, such that we can use the same approach everywhere.
I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?
Web APIs are littered with these.
Search for or on this page: https://firebase.google.com/docs/reference/js/firebase.firestore.Query
Search for DartName= in https://github.com/dart-lang/sdk/blob/master/tools/dom/idl/dart/dart.idl
WebIDL explicitly supports Union types - https://www.w3.org/TR/WebIDL-1/#idl-union - so this is always an issue when interfacing with web/JS apis
CC @sethladd
Another random data point: I've been using C# again recently (for a silly hobby project), and wow, having static overloads is so nice. I'd forgotten how nice they are. Really helps with API design, too.
Would syntax sugar over callable classes work?
Right now we can achieve the equivalent of named constructors for functions using callable classes:
const someFunction = _SomeFunction();
class _SomeFunction {
const _SomeFunction();
void call() {}
int customName() => 42;
}
which allows
someFunction();
int result = someFunction.customName();
It works but is not very convenient to write.
Allowing . in the name of functions may be a good idea:
void someFunction() {}
int someFunction.customName() => 42;
The bonus point here is that since it's not actual method overload, dynamic invocation still works just fine.
This issue seems related to #83. Is there a canonical issue for discussing union / sum types? Can we close the others as duplicates?
Adding my 2 cents here
Approach A: Common Superclass
Imagine I want to create a tree data structure that has 2 types of nodes: a ChildNode contains a value, and an InternalNode contains some list of children nodes. I could accomplish this using a super class like this:
abstract class Node {}
class ChildNode extends Node {
final String value;
ChildNode(this.value);
}
class InternalNode extends Node {
final List<Node> children;
InternalNode(this.children);
}
This works, because I control the definition of both classes, and so I can have them inherit from a common superclass. To make use of a Node, I would have to handle each subclass like this:
String nodeToString(Node node) {
if (node is ChildNode) {
return (node as ChildNode).value;
}
assert(node is InternalNode, "Unexpected Node type. Expected a ChildNode or an InternalNode");
var buffer = StringBuffer();
for (var child in (node as InternalNode).children) {
buffer.write(nodeToString(child));
}
return buffer.toString();
}
There are 3 things to note here:
1. The Dart type system didn't help me to narrow the type of node within the branches of the program. For example, despite being inside of an if block with node is ChildNode, I still had to cast node as ChildNode before I could access the value property. Similarly, the code after the assertion didn't narrow the type of node to InternalNode.
2. The Dart type system didn't help me to check for the exhaustiveness of the union (ex. what if I didn't handle InternalNode).
3. Along the lines of 1 and 2 above, if node is ChildNode is false, there's no way for the Dart type system to narrow the type of node to InternalNode, because another module could define a different subclass of Node that would be valid here too.
Despite the disadvantages (1-3) above, this approach can work when the child types are written by the developer themselves. However, what about something like an id value that can be a String or an int?
Approach B: Union Wrapper
In the case of a union on types that are not "owned" by the code author, a wrapper class can be created. For example, in the case of a String or int:
class StringOrInt {
final String? _string;
final int? _int;
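// Each constructor sets the other final field to null explicitly.
StringOrInt.fromString(String this._string) : _int = null;
StringOrInt.fromInt(int this._int) : _string = null;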
bool get isString => _string != null;
bool get isInt => _int != null;
String asString() {
assert(isString);
return _string!;
}
int asInt() {
assert(isInt);
return _int!;
}
}
One thing to note here is that, in the asFoo methods, Dart doesn't infer the type of _foo to be non-null, despite the assert. Either way, here is how the class can be used:
class Foo {
  final String name;
  final int id;
  Foo(this.name, this.id);
}
List<Foo> foos = [];
Foo? findFooByName(String name) {
// omitted
}
Foo? findFooById(int id) {
// omitted
}
Foo? findFoo(StringOrInt query) {
if (query.isString) {
return findFooByName(query.asString());
}
return findFooById(query.asInt());
}
This is definitely awkward to write and use, but it has some advantages:
- It works for native types like String or int.
- Asserts are implicit. query.asInt() has an assertion that will throw if query doesn't represent an int type. If query.isString returns false, we know it won't throw.
- Type casts aren't necessary (they're implicit). When we call query.asString, we're guaranteed a String, not a String?.
Extensions
We can extend this further by adding methods to easily convert from the base type:
extension StringUnion on String {
  StringOrInt toUnion() => StringOrInt.fromString(this);
}
extension IntUnion on int {
  StringOrInt toUnion() => StringOrInt.fromInt(this);
}
Generics
Alternatively, we can generalize this as follows:
class Union<T, U> {
final T? _left;
final U? _right;
Union.left(this._left) : _right = null;
Union.right(this._right) : _left = null;
bool get isLeft => _left != null;
bool get isRight => _right != null;
T get left {
assert(isLeft);
return _left!;
}
U get right {
assert(isRight);
return _right!;
}
dynamic get deref {
return isLeft ? left : right;
}
}
This has the advantage of not needing to create type-specific unions for each combination that needs one. The disadvantage is that the naming is very generic. Similarly, you could create a sort of extension method with this as follows:
extension ToUnion on String {
Union<String, T> toUnion<T>() => Union.left(this);
}
You could use this as follows:
"some string".toUnion();
The problem with this is that it doesn't scale:
extension ToUnion on int {
Union<int, T> toUnion<T>() => Union.left(this);
}
Union<String, int> query = "name".toUnion<int>(); // this works
query = 42.toUnion<String>(); // this doesn't work. Union<int, String> != Union<String, int>
You could try to fix this using some sort of flip operation:
class Union<T, U> {
// as before
Union<U, T> flip() => isLeft ? Union<U, T>.right(left) : Union<U, T>.left(right);
}
The resulting code would work like this:
Union<String, int> query = "name".toUnion<int>();
query = 42.toUnion<String>().flip();
This is quite awkward, and the use of it is even more so: query.left instead of query.asString(), etc.
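One variation on the wrapper idea, sketched here under the assumption that Dart 3 sealed classes are available, stores each side in its own subclass instead of in a pair of nullable fields. That removes the asserts and the null checks. The names Either, Left, Right, fold, and describeQuery are illustrative and not part of any SDK library.

```dart
// Hypothetical two-way wrapper; requires Dart 3 or later.
sealed class Either<L, R> {
  const Either();

  // Calls exactly one callback, depending on which side holds the value.
  T fold<T>(T Function(L) onLeft, T Function(R) onRight) => switch (this) {
        Left(:final value) => onLeft(value),
        Right(:final value) => onRight(value),
      };
}

final class Left<L, R> extends Either<L, R> {
  const Left(this.value);
  final L value;
}

final class Right<L, R> extends Either<L, R> {
  const Right(this.value);
  final R value;
}

// Example consumer: a query that is either a name or a numeric id.
String describeQuery(Either<String, int> query) =>
    query.fold((name) => 'name=$name', (id) => 'id=$id');
```

The generic naming problem described above remains: callers still see Left and Right rather than domain names, and Either<int, String> is still a different type from Either<String, int>.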
Approach C: Full-blown dynamic
Another approach that seems obvious, but should not be glossed over, is just using dynamic:
dynamic query = 2;
query = "abc"; // valid
if (query is int) {
// do int stuff
} else {
assert(query is String);
// do String stuff
}
This suffers from the same drawbacks of approach A with regard to type narrowing, but it can be used on native types like String or int.
Another drawback here is that the type of the variable ends up having to be documented in plain text (ex. "Only call this with a String or an int") which is error-prone at best and also a poor developer experience.
Potential Language Improvements
Improvement 1: Allow mixin extensions
mixin Query {}
extension Queryable on String with Query {}
extension Queryable on int with Query {}
This solves the problem of the superclass approach for native types or types outside of the developer's control. It still has problems with narrowing.
Improvement 2: Improve type narrowing in dart
In the case above, this code:
if (node is ChildNode) {
return (node as ChildNode).value;
}
Should be valid as this:
if (node is ChildNode) {
return node.value;
}
And similarly, for this:
bool get isLeft => _left != null;
T get left {
assert(isLeft);
return _left!;
}
We should be able to write:
bool get isLeft => _left != null;
T get left {
assert(isLeft);
return _left; // can't be null, because of the `assert`
}
Improvement 3: Full support for union types
Supporting a type such as this:
String | int query = "hello";
query = 2;
query = 3.14; // error: `double` can't be assigned to `String | int`.
Open questions
What would happen if union types are used as a generic parameter? For example:
List<String | int> queries = [];
// or this:
var queries = <String | int>[1, 3, 4];
Could this be handled in the same way as nullable types? For example:
List<int?> maybeInts = [];
Alternatively, should nullability be treated as a special case of a union type?
What about the Type object for a union type? Should the Dart VM provide some sort of runtime information specific to a union? Should the Type for a union encompass some sort of List<Type>? If so, what about typedefs?
typedef Query = String | int;
typedef QueryBuilder = Query Function();
typedef QueryInput = Query | QueryBuilder;
What would the members of QueryInput's Type be? [String, int, QueryBuilder]? Or [[String, int], QueryBuilder]? Would multiple unions be allowed? Ex. (A | B | C)? Would they be represented as a union of unions? Or a single union? If the former, what is the associativity of the union operator?
I'm open to discussing this further and contributing to speccing something out.
Bonus idea: Could union types be used as an implicit interface for a sealed type as per https://github.com/dart-lang/language/blob/master/working/0546-patterns/exhaustiveness.md#sealed-types ? If so, we could piggy-back exhaustiveness checking for sealed types off of type narrowing for union types.
@TzviPM Your "Improvement 2" already works. You may try the following code, which uses members that are only available after type promotion (toUpperCase from String, or isOdd from int).
import 'dart:math';
void main() {
final value = Random().nextBool() ? "I'm a String" : 10;
if (value is String) {
print(value.toUpperCase());
} else if (value is int) {
print('Is odd? ${value.isOdd}');
}
}
Regarding the assert, it doesn't work because of its dynamic nature, but in other cases it will promote correctly, for instance:
Left useLeftAsNonNullable() {
final left = getLeft(); // Nullable
final isLeft = left != null;
if (!isLeft) throw StateError('Left should not be null here!');
return left;
}
@TzviPM wrote:
The dart type system didn't help me to narrow the type of node within the branches of the program
[Edit: Checking again, node wasn't an instance variable, so the comment I wrote about that was not relevant.]
But promotion works just fine with a current version of dart, and some small adjustments of the code:
abstract class Node {}
class ChildNode extends Node {
final String value;
ChildNode(this.value);
}
class InternalNode extends Node {
final List<Node> children;
InternalNode(this.children);
}
String nodeToString(Node node) {
if (node is ChildNode) {
return node.value;
} else if (node is InternalNode) {
var buffer = StringBuffer();
for (var child in node.children) {
buffer.write(nodeToString(child));
}
return buffer.toString();
} else {
throw "Unexpected subtype of Node";
}
}
void main() {
print(nodeToString(ChildNode('A node')));
}
The dart type system didn't help me to check for the exhaustiveness of the union
Dart is quite likely to introduce a notion of 'sealed' or 'switch' classes, and this concept is specifically aimed at enabling exhaustiveness checks. So there is no solution right now, but that is likely to change.
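Sealed classes did later ship in Dart 3. As a sketch of what the exhaustiveness check looks like for the Node example, assuming a Dart 3 toolchain and reusing the class names from the code above:

```dart
// Requires Dart 3 or later. `sealed` restricts direct subtypes of Node to
// this library, so the compiler knows the full set of cases.
sealed class Node {}

final class ChildNode extends Node {
  ChildNode(this.value);
  final String value;
}

final class InternalNode extends Node {
  InternalNode(this.children);
  final List<Node> children;
}

String nodeToString(Node node) => switch (node) {
      ChildNode(:final value) => value,
      InternalNode(:final children) => children.map(nodeToString).join(),
      // No default case: removing either case above is a compile-time error.
    };
```

Because no other library can add a direct subtype of Node, the fallback branch that throws "Unexpected subtype of Node" is no longer needed.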
You might be interested in a new package, https://pub.dev/packages/extension_type_unions. More details in this comment.