language icon indicating copy to clipboard operation
language copied to clipboard

[extension types] Let `T` be assignable to a fully transparent extension type with representation type `T`

Open eernstg opened this issue 5 months ago • 10 comments

This issue is a proposal to add an assignability relationship. Informally, it goes as follows: Assume that E is an extension type with representation type R which is a non-extension type that is also a superinterface of E. In this case the type R and subtypes thereof are assignable to the type E. For example:

extension type FancyString(String it) implements String {
  // ...
}

FancyString x = 'Hello '; // OK.
FancyString f() => 'world!'; // OK.

The conceptual justification for this assignability rule is that FancyString "announces to clients" that it is an extension type that has a non-extension representation type which is related to String by having implements String (this could be directly or indirectly), and the superinterface and the representation type are exactly the same type, not just subtype-related. In other words, the representation type of FancyString is no secret at all.

In this case we say that the extension type is a fully transparent extension type.

We need to make sure that the assignability is applicable to composite cases: Assume again that E is a fully transparent extension type with representation type R. When R is assignable to E we also want List<R> to be assignable to List<E>, and R Function(E) should be assignable to E Function(R). (Thanks to @lrhn for making this point at a time where I hadn't thought through how we'd deal with structural types!) The crucial point is that there are no run-time actions associated with this kind of assignability, which means that a function of type R Function(E) will behave exactly like a function of type E Function(R) at run time, and hence we can allow one to occur where the other is expected. Regular subtyping allows E Function(R) to occur where R Function(E) is expected, and this new kind of assignability allows the converse.

We introduce the covariant fully transparent erasure and the contravariant fully transparent erasure of a type, the former also being known as the fully transparent erasure. Both of these functions replace an extension type by the corresponding representation type, recursively and on all subterms. However, the covariant erasure only does this on types that occur in a covariant position and the contravariant erasure only does this on types that occur in a contravariant position, and the covariant erasure invokes the contravariant erasure on subterms in a contravariant position and vice versa.

For example, FancyString erases (covariantly) to String, FancyString Function(FancyString) erases to String Function(FancyString), FancyString erases contravariantly to FancyString, and FancyString Function(FancyString) erases contravariantly to FancyString Function(String).

With that terminology in place, we can restate the proposal:

A type T is assignable to a type S if the contravariant fully transparent erasure of T is a subtype of the covariant fully transparent erasure of S.

The rule could have used "is assignable to" rather than "is a subtype of", but this is a possible future enhancement, and we prefer to avoid the complexity of having multiple transformations on the same expression.

We have in several other ways used the fact that an extension type E implements a certain non-extension type I as a signal that E is associated with I (for instance "E is-an I" in the sense that E is a subtype of I, and E has all the non-redeclared members of the interface of I). This proposal takes that perspective one step further and introduces assignability from the representation type to the extension type (which is the opposite direction of what we already have, based on the subtype relationship).

Consider the well-known example that demonstrates the potential for run-time type errors based on dynamically checked covariance:

void main() {
  List<int> xs = [1];
  xs.add(1); // OK at compile time and run time.
  List<num> ys = xs; // Also OK all the way.
  ys.add(1.5); // Throws a type error at run time.
}

If we wish to adapt the treatment of the type List such that it is considered to be invariant then we can introduce an extension type with a phantom type argument:

// ----- Library 'invariant_list.dart'.
import 'dart:core' as core show List;
import 'dart:core' hide List;

typedef Inv<X> = X Function(X);
typedef List<X> = _List<X, Inv<X>>;

extension type const _List<X, I extends Inv<X>>(core.List<X> _it)
    implements core.List<X> {}

// ----- Library 'my_library.dart'.
import 'invariant_list.dart';

void main() {
  List<int> xs = [1]; // OK: `core.List<int>` is assignable to `List<int>`.
  xs.add(1); // OK at compile time and run time.
  List<num> ys = xs; // Compile-time error.
  ys.add(1.5); // Unsafe operation irrelevant, declaration of `ys` is an error.

  // Example from @lrhn
  List<List<int>> ys = [[1], [2]]; // OK: `core.List<core.List<int>> <: core.List<core.List<int>>`.

  // Using contravariance.
  void f(List<num> xs) => print(xs.length);
  void g(void Function(core.List<int>) h) {}
  g(f); // OK: `void Function(core.List<num>) <: void Function(core.List<int>)`.
}

However, the initialization List<int> xs = [1]; will only succeed if this proposal is adopted. Otherwise it's a compile-time error because the type List<int> from 'invariant_list.dart' isn't a supertype of the type List<int> from 'dart:core'. In that case we'd have to use List<int> xs = List([1]);, which implies that the List type from 'invariant_list.dart' is no more a drop-in replacement for the List type from 'dart:core'.

During inference, a fully transparent extension type E with representation type R would give rise to R as the context type. Note that an expression with context type R could have type E or a subtype thereof (which is of course assignable to E, and also a subtype of the context type), or it could have type R or a subtype thereof (which is now also assignable to E).

Note that this proposal basically subsumes the proposal in https://github.com/dart-lang/language/issues/3607: We would just include an extra rule to say that a fully transparent extension type E with representation type R can be the return type of an async/async*/sync* function if and only if R can be said return type; in this case, the future value type / element type of the function is computed from R.

@dart-lang/language-team, WDYT?

[Edit: Feb 14: Generalized assignability to be structural.]

eernstg avatar Feb 14 '24 10:02 eernstg

which is a non-extension type that is also a superinterface of E

Does it have to be a non-extension type? It should work just as well adding extension types on extension types. (If the representation type is an extension type, should we recurse or not? I'm fine with "not". It's what we do for allowing superinterfaces, so it's consistent.)

Does it have to be the same type? (Because I really don't like it when things that depend on two types being the same.)

It's not restricted to immediate superinterfaces, which is good.

Maybe:

Let E be an extension type with representation type R. If there exists a type S such that E has S as superinterface and S <: R, then R is assignable to E.

Then S and the representation type do not have to be exactly the same type, one can be List<Object?> and the other List<dynamic>, but they will be mutual subtypes. That should be enough to still justify the assignability, if being the same type does.

This only defines assignability, which is effectively an (infalliable) implicit casting coercion. If e has type S, context type is E, then implicitly cast to E. Since S<:R, that's always sound.

We should probably consider if/how this coercion interacts with other coercions. It probably does not, since extension types cannot implement function types or Function, so a context type which is an extension type cannot also be a type that we do .call or instantiation coercions to. And if the expression type is dynamic, we'll do the unsafe downcast to extension type anyway.

It's assignability, not subtyping, so it's only skin-deep. Introducing a replacement for, fx, List only works with a context type. While this works:

var list = [1];
List<int> intList = list;

this will not:

var lists = [[1],[2]];
List<List<int>> intLists = lists;

The latter will see an assignment from core.List<core.List<int>> to ext.List<ext.List<int>>, with only one assignment which can coerce. The rule here does not apply because core.List<core.List<int>> is not a subtype of core.List<ext.List<int>>, the representation type of ext.List<ext.List<int>>.

Assignability is not subtyping, because it doesn't work on nested types. However this works, and should have no runtime overhead:

var lists = [[1],[2]];
List<List<int>> intLists = lists as List<List<int>>;

so are we giving people half a feature here?

If the only use-cases are something like returning a Future from an async function declared as returning FancyFuture<T> implements Future<T>, then this is probably enough. We'll have to update a some places to say that Future<T> must be assignable to the declared return type, not only a subtype (and we can feel really clever about not allowing implicit extension method .call tear-off, because that would have made it an awful mess). It's doable.

But if we have to touch all those places anyway, maybe just doing the special casing there is enough:

If Future<T> is a subtype of F or F is an extension type with representation type R, and there is an interface S s.t, F implements S, Future<T><:S and S <: R, then ...

(We can introduce a predicate for that, "F is representable by T", which means T <: F or F is extension type with representation type R and a superinterface S s.t. S <: R and T <: S. Or a predicate on extension types, "E is fully transparent if E has representation type R and a superinterface S and S <: R. Then "F is representable by T" if T <: F or F is a fully transparent extension type with representation type R and T <: R. Specification options are plenty.)

lrhn avatar Feb 14 '24 15:02 lrhn

Great questions!

Does it have to be a non-extension type?

No. However, it is my impression that the variant with the non-extension type is conceptually simple and well justified (that particular kind of extension type really doesn't hide the type of the representation object). With an extension type whose representation type is yet another extension type, it's much less obvious to me that we can find a similarly natural and convincing conceptual perspective.

In any case, it would be very easy to generalize this feature such that it allows some extra assignability relationships later on, and the change would be non-breaking.

Does it have to be the same type?

That is indeed kind of peculiar. However, I wanted to allow the notion of being "fully transparent" as narrow. The language team did not support adding keywords to make this kind of distinction, so we have to use something else, and this notion of being fully transparent does seem rather natural to me.

In particular, consider the following example:

extension type E(int _) implements num {}

If int is assignable to E then we can have E e = 42;, which does make sense but seems like a violation of encapsulation.

If num is assignable to E (so int is also, because it's a subtype) then we maintain the usual encapsulation, but it is surprising (and probably not a good idea) if this implies that the assignability includes a dynamic type check (E e = 1.5; would be statically OK because 1.5 is a num, but it throws at run time because the value must be an int).

It's assignability, not subtyping, so it's only skin-deep.

Right, I did not want to make it a subtyping property. In particular, we could have a subtyping relationship to another extension type that isn't fully transparent, which would cause some anomalies:

extension type E1(int _) {}
extension type E2(int _) implements int, E1 {}

void main() {
  E1 e1 = E1(1);
  E2 e2 = E2(1);
  int i = 1;
  e2 = i; // OK, `int <: E2`, not in the proposal, we're just exploring that option.
  e1 = e2; // OK, `E2 <: E1`.
  e1 = i; // Compile-time error?
}

However, it's a very good point that we need to handle composite cases in order to deliver a sufficiently convenient behavior. I did think about that, but forgot to dive into it.

We would use a rule which is structural, such that core.List<core.List<int>> is assignable to ext.List<ext.List<int>>. The criterion would be that the source type is a subtype of the fully transparent extension type erasure of the target type, and said erasure would work just like the normal extension type erasure, except that it would only replace an extension type by its corresponding representation type when the extension type is fully transparent.

this will not:

var lists = [[1],[2]];
List<List<int>> intLists = lists;

With the structural rule mentioned above, it would work. I think it should work, too. I adjusted the proposal to include this criterion.

If the only use-cases are something like returning a Future from an async function

Embellishing the return type by means of this mechanism is likely to be an important use case, but there are many other situations where assignability is used. For example, passing actual arguments to functions, initializing or assigning to variables (any kind), specifying elements in collection literals, and more.

eernstg avatar Feb 15 '24 09:02 eernstg

I think this feature should be optional with implicit constructors. Because, one of my favorit use cases of extension types is smart typedefs to which the representation types aren't assignable.

extension type Height(double _) implements double {}
extension type Weight(double _) implements double {}

double calcBmi(Height height, Weight weight) => weight / ( height * height);

void main() {
  var height = Height(1.64);
  var weight = Weight(54);
  var bmi = calcBmi(height, weight);
  print(bmi); // 20.077334919690664
  bmi = calcBmi(1.64, 54.0); // compile-time error
  bmi = calcBmi(weight, height); // compile-time error
}

Cat-sushi avatar Mar 25 '24 12:03 Cat-sushi

You could indeed use implicit constructors to achieve a similar effect.

However, implicit constructors as proposed here is a different mechanism: more general in some ways and less general in other ways.

It is more general because the operation that produces the desired type of object from a given expression of a different type can inject arbitrary computations, and because it can enable a conversion from any type to any other type. For example:

// Hypothetical code: assumes that implicit constructors have been added to Dart.

// Suppose we want to transform `int`s into `String`s, implicitly.
static extension on String {
  implicit factory String.fromInt(int i) => i.toString();
}

void main() {
  String s = 42; // OK, becomes `String s = String.fromInt(42);`.
}

A somewhat similar mechanism has been provided in Scala for several years (see https://docs.scala-lang.org/scala3/reference/contextual/conversions.html). The experience from the Scala community is that it may be difficult to keep track of implicit conversions and hence they should be made explicit in some way (somewhere in the same file, at least).

Following that insight, it is required in the Dart proposal that each individual imported implicit constructor which is capable of being used implicitly must explicitly be enabled in the import clause. So you will know for sure that no implicit conversions can take place unless the implicit constructor is declared in the current library, or the implicit constructor is imported, and it is enabled in the corresponding import clause.

In particular, an implicit coercion from the representation type R of an extension type E could be handled by a trivial implicit constructor (it doesn't actually have to do anything, we just give the same object a new type). So in that sense, implicit constructors are much more powerful than what's needed for the proposal in this issue.

However, implicit constructor invocations are less general than the mechanism proposed here in another sense: The mechanism proposed here allows structural lifting of the assignability (which is possible exactly because there's no need to perform any actual computations, we're just giving some objects a new type).

// Hypothetical code: Assumes the feature proposed in this issue.

extension type Height(double _) implements double {}
extension type Weight(double _) implements double {}

double calcBmi(Height height, Weight weight) => weight / ( height * height);

void main() {
  Map<Height, Weight> map;
  map = {1.64: 54.0, 1.84: 77.8, 1.86: 92.2};
  for (var MapEntry(key: h, value: w) in map.entries) {
    print('BMI: ${calcBmi(h, w)}');
  }
}

For the assignment to map, the right hand side has type Map<double, double>. This type is assignable to Map<Height, Weight> because double is assignable to Height and double is assignable to Weight. This structural generalization is not viable in the case where the conversion is performed by means of actual computations, which means that we can't (realistically) generalize implicit constructors to allow a similar structural generalization.

This should illustrate why it is useful to have both proposals on the table, and possibly even adding both of them to the language.

eernstg avatar Mar 25 '24 16:03 eernstg

I meant,

bmi = calcBmi(1.64, 54.0); // compile-time error

should be errror as it is so far.

Cat-sushi avatar Mar 25 '24 20:03 Cat-sushi

bmi = calBmi(1.64, 54.0); // compile-time error

should be errror as it is so far.

Right, that's a delicate point.

I proposed using a keyword to indicate properties like assignability (so we'd have this kind of assignability with an extension type if and only if its declaration starts with the keyword open). That's definitely one way to control this property explicitly. But that proposal wasn't accepted by the language team, mainly because we shouldn't have an explosion in all the possible sequences of keywords in a declaration. (We already have stuff like abstract base mixin class ;-)

We might reconsider using a keyword, but in the meantime I'm trying to see if we can use the superinterface hierarchy to indicate openness. This happens to work quite well for an extension type E that has implements T where T is a non-extension type which is a supertype of the representation type: This means that E is a subtype of T (hence assignable to T) and also that E "inherits" all the members of T that aren't redeclared by E, and that is usually a reasonable feature set. In this issue I'm just proposing that we should add assignability in the opposite direction in the special case where T is the representation type.

If you don't want assignability (in both directions), but you do want to have all the double members (or some of them) then you could use another feature that we've considered, export:

extension type Height(double it) {
  export it;
}

void main() {
  Height h = 181.0; // Error, `double` not assignable to `Height`.
  double d = h; // Error, `Height` not assignable to `double`.
  h.floor(); // OK, all `double` members have forwarders in `Height`.
}

The point is that we want to reuse members, but we don't want to create the subtype relationship, and export will do exactly that.

eernstg avatar Mar 26 '24 11:03 eernstg

When this proposal will be accepted, please implement #2506 together.

Cat-sushi avatar Mar 27 '24 21:03 Cat-sushi

With structural assignability, and using the erasure as context type, why not just make it a proper subtype?

S <: T if T <: S, T is an extension type with representation type R and S <: R,.

Here T <: S already implies R <: S, since that's a requirement for an extension type implementing any subtype of S.

(Everything after the "if" is the definition I'd use for an extension type being transparent.)

That would give us T <: S <:> R then implying T <:> S <:> R.

We'd have all the positive effects, with much less effort.

What would the negatives be?

The extension type becomes an equivalent, but different, type to the supertype/representation type. It can have different members, but you shouldn't enter or exit the extension type unpredictably, only when actually assigning to an equivalent type.

Should the extension type be a type of interest for the supertype? That is, do we get assignment promotion? Probably not. Nor vice versa.

And the extension type still has a higher depth than its declared supertype, so they won't be crowding each other in UP calculations.

Is having a cycle on the supertype graph a problem? Shouldn't be, we already have that in other places. (But do we always normalize those to the right thing? Probably not, FutureOr<Future<Object>> and Future<Object> are equivalent and don't normalize.)

Is it a problem that a type implements a subtype of itself? Can't say for sure, I'll leave that to the type theorists, but the implements hierarchy does not have cycles, because we disallow that explicitly. We collapse some chains of implementing types into equivalence classes for subtyping, but the only thing we actually change, the extension type, must be implementing specific interfaces that are now also subtypes of it. It won't be different instantiations of those interfaces, so there should be no conflict.

lrhn avatar Mar 28 '24 13:03 lrhn

What would the negatives be?

If we specify that the representation type is assignable to the extension type when the latter is 'fully transparent' as defined above then we will allow an expression of the representation type to be passed where the extension type is required (including higher order cases like List<E> xs = <R>[someRep]).

If we introduce a full subtype relationship then we will also allow unlimited assignment from an arbitrary different fully transparent extension type with the same representation type.

We should at least be aware of that difference, and I suspect that it is too permissive. If you want that then perhaps just write all those methods as plain extension methods, rather than declaring extension types in the first place.

eernstg avatar Mar 29 '24 11:03 eernstg

I proposed using a keyword to indicate properties like assignability (so we'd have this kind of assignability with an extension type if and only if its declaration starts with the keyword open).

I agree with this. The implicit assignability is precisely what I wouldn't want for something like the BMI example.

The Weight and Height are validated types, they may implement double, but that means it's OK to convert them to a number, but not that it's OK to convert any number to a Weight or Height.

// Weight in kg.
extension type const Weight._(double _) implements double {
  const Weigth.kg(double kg) : this._(kg);
  const Weigth.g(double g) : this._(g / 1000);
}
// Height in m.
extension type const Height._(double _) implements double {
  const Height.cm(double cm) : this._(cm / 100);
  const Height.m(double cm) : this._(m);
}

double bmi(Weigth weight, Height height) => weigth / (height * height);

void main() {
  var height = Height(1.74);
  var weight = Weight(82.0);
  print("BMI: ${bmi(height, weight).toStringAsFixed(2)}");
}

Notice the mistake? I swapped the height and weight arguments. The types exist preciesely to prevent that from happening, but if we make any extension type which implements its representation type to be transparent, and allow the representation type to be assignable to the extension type, then we allow Weight <: double to be assignable to Height.

So no implicit transparency, because we'd just need an explicit way to opt out of it anyway.

Adding open would be fine. An open extension type BetterFuture<T>(Future<T> _) implements Future<T> { ...moreMethods... } is fine. It's an "extended alias".

Maybe extension typedef BetterFuture<T>(Future<T> _) { ... } which automatically implements Future<T>, and which is also a supertype of Future<T>?

lrhn avatar May 07 '24 13:05 lrhn