Proposal: don't require synthetic mixin applications to have a well-typed interface.
Consider the following code (from https://github.com/dart-lang/sdk/issues/61145):
class A {}
class B {}
class C implements A, B {}
abstract class D {
  A get foo;
}
abstract class E {
  B get foo;
}
mixin M implements E {}
abstract class F extends D with M {
  C get foo;
}
main() {}
The analyzer accepts this code; the CFE rejects it. The CFE says:
../../tmp/proj/test.dart:11:16: Error: Class 'D with M' inherits multiple members named 'foo' with incompatible signatures.
Try adding a declaration of 'foo' to 'D with M'.
abstract class F extends D with M {
^
../../tmp/proj/test.dart:5:9: Context: This is one of the overridden members.
A get foo;
^^^
../../tmp/proj/test.dart:8:9: Context: This is one of the overridden members.
B get foo;
^^^
Based on my understanding of the spec, the CFE is correct in rejecting the code, because:
- The declaration `abstract class F extends D with M` should be understood as syntactic sugar for `abstract class F extends DwithM`, where `DwithM` is a synthetic mixin application, defined as `class DwithM = D with M` (see the sketch after this list).
- The interface for class `DwithM` inherits two declarations of `foo`: one with type `A` (from `D`), and one with type `B` (from `M`, via `E`).
- Since neither `A` nor `B` is a subtype of the other, the compiler cannot assign a type to `foo` in the interface for `DwithM`.
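Here's a hedged sketch of that desugaring; `DwithM` is a stand-in name, since the real synthetic class cannot be named in source:

abstract class DwithM = D with M;
abstract class F extends DwithM {
  C get foo;
}

Writing the application out explicitly like this should trigger the same conflicting-signature error in both tools, because explicit mixin application classes are required to have well-formed interfaces.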
This is a particularly frustrating situation because:
- The error can't be conveniently worked around; the CFE's suggestion to "Try adding a declaration of 'foo' to 'D with M'" is impossible.
- It's not obvious why this code needs to be rejected at all; after all, class `F` provides a declaration of `foo` with a type that's compatible with the interfaces of both `D` and `M`, and it's impossible for any expression to have a static type of `DwithM`, so there's no reason the compiler would ever need to look up the type of `foo` in `DwithM` at all. ~~More generally, it's impossible for an expression to have a static type that is a synthetic mixin application. (The proof of this is subtle; I'll make a follow-up comment with details.)~~ Edit: this is wrong in some cases involving generics. See https://github.com/dart-lang/language/issues/4476#issuecomment-3113899431.
I propose that we drop the requirement for synthetic mixin applications to have well-defined interfaces, in which case the analyzer's behavior will be considered correct, and the example code above will be allowed.
CC @dart-lang/language-team
Edit: this is wrong in some cases involving generics. See https://github.com/dart-lang/language/issues/4476#issuecomment-3113899431.
Here's my proof of why it's impossible for an expression to have a static type that is a synthetic mixin application. Actually, I prove a slightly stronger result: it's impossible for an expression to have a static type that is or contains a synthetic mixin application.
Assume to the contrary that C is a synthetic mixin application class, and some expression in the source code has a static type that is or contains C. (If C is a generic class, it might be C<T1, T2, ...>).
With the exception of certain special types mentioned in the spec (e.g., int), the only way for an expression's static type to contain an interface type C is if:
- It has a subexpression whose static type contains C,
- It refers to a variable whose static type contains C,
- It invokes a method or getter whose return type contains C,
- Or its static type is determined by the least upper bound algorithm.
In the first three cases, we can trace the origin of C to some other expression, variable, method, or getter whose static type contains C. If that type is implicit (i.e., determined by type inference), then it can in turn be traced back further, so if we trace back far enough, we will eventually find either an explicit mention of C or a least upper bound operation.
But if C is a synthetic mixin application class, it can't be explicitly mentioned, so we will eventually trace back to a least upper bound operation.
If one of the inputs to that least upper bound operation is a type that contains a synthetic mixin application class, we can trace it back further, so we must eventually reach a least upper bound operation whose inputs are not synthetic mixin application classes.
Recall that the least upper bound of two interface types I and J is formed by first considering their sets of super-interfaces S_I and S_J, and then computing S = ({I} union S_I) intersect ({J} union S_J). S is the set of candidate upper bounds. The "least" upper bound is found by ranking the members of S by their depth in the inheritance graph (the depth is defined as the number of steps in the longest inheritance path to Object), and then taking the deepest member of S that is the only member at its depth.
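To make the ranking step concrete, here's a minimal sketch of that algorithm over a toy graph mapping each class name to its direct superinterfaces. All the names here (depth, allSuperinterfaces, dart1Lub) are invented for illustration, not taken from the spec:

int depth(String type, Map<String, List<String>> graph) {
  final supers = graph[type] ?? const <String>[];
  if (supers.isEmpty) return 0; // Object
  // Depth is the length of the longest inheritance path to Object.
  return 1 +
      supers.map((s) => depth(s, graph)).reduce((a, b) => a > b ? a : b);
}

Set<String> allSuperinterfaces(String type, Map<String, List<String>> graph) {
  final result = <String>{};
  for (final s in graph[type] ?? const <String>[]) {
    result.add(s);
    result.addAll(allSuperinterfaces(s, graph));
  }
  return result;
}

String? dart1Lub(String i, String j, Map<String, List<String>> graph) {
  // S = ({I} union S_I) intersect ({J} union S_J): the candidate bounds.
  final s = {i, ...allSuperinterfaces(i, graph)}
      .intersection({j, ...allSuperinterfaces(j, graph)});
  // Group the candidates by depth, then pick the deepest depth at which
  // there is exactly one candidate.
  final byDepth = <int, List<String>>{};
  for (final t in s) {
    byDepth.putIfAbsent(depth(t, graph), () => []).add(t);
  }
  for (final d in byDepth.keys.toList()..sort((a, b) => b.compareTo(a))) {
    if (byDepth[d]!.length == 1) return byDepth[d]!.single;
  }
  return null; // Unreachable in practice: Object is always a candidate.
}

void main() {
  // Toy graph: C extends B extends A extends Object; D extends A.
  final graph = {
    'Object': <String>[],
    'A': ['Object'],
    'B': ['A'],
    'C': ['B'],
    'D': ['A'],
  };
  print(dart1Lub('C', 'D', graph)); // A: deepest candidate alone at its depth
}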
In order for this algorithm to produce a synthetic mixin application class C, given two inputs that are not synthetic mixin application classes, C would have to be the sole member of S with depth q, and there would have to be no greater depth for which S contained exactly one member.
But this is impossible, because in the inheritance graph, every synthetic mixin application class always has a single direct subclass (either the class whose declaration caused that mixin application to be synthesized, or the next step in the chain of synthetic mixin applications, in the case where there are multiple types named in the with clause). Therefore, if C is the sole member of S with depth q, then this direct subclass must be the sole member of S with depth q+1, contradicting the assumption that the least upper bound operation would yield class C.
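For reference, here's the chain structure that the last step relies on, written out with hypothetical names (the real synthetic classes are unnameable):

// Illustrative desugaring of `class X extends A with M1, M2 {}`. Each
// synthetic application has exactly one direct subclass: the next
// application in the chain, or X itself.
class A {}
mixin M1 {}
mixin M2 {}
class A_M1 = A with M1;       // sole direct subclass: A_M1_M2
class A_M1_M2 = A_M1 with M2; // sole direct subclass: X
class X extends A_M1_M2 {}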
GitHub really needs a 🤯 reaction for your follow-up comment. That all sounds right to me.
One thing to worry about is mixin application class canonicalization.
I believe some backends conflate mixin application classes from different locations, if they are the same mixin on the same class.
class B with M1, M2 {}
class C with M1, M3 {}
If Object with M1 gets canonicalized to the same class, then UP(B, C) should be that class.
I was going to say that I agree otherwise ... but LUB is stoopid. So I think it's just not true in general.
mixin Baddy {}
class Base {}
class Helper<T> extends Base with Baddy {}
class Sub1 extends Helper<int> {}
class Sub2 extends Helper<double> {}
Then UP(Sub1, Sub2) should be Base & Baddy, because there is no other shared superinterface with depth 2, and no shared superinterface with higher depth, and Base and Baddy both have depth 1 giving Base & Baddy depth 2. The other superinterfaces differ in instantiation, which we only try to account for in the first level of UP, but not when falling back on LUB.
The bug in the proof is:
> Therefore, if C is the sole member of S with depth q, then this direct subclass must be the sole member of S with depth q+1
A single direct subclass does not imply a single interface.
EDIT: Added Base class to avoid Baddy and Object & Baddy both having depth 1.
I have proposed, and stand by, that we should completely ignore anonymous mixin application classes in LUB, including not counting them for depth.
(I did mean "LUB", fixed.)
> One thing to worry about is mixin application class canonicalization.
> I believe some backends conflate mixin application classes from different locations, if they are the same mixin on the same class.
>
> class B with M1, M2 {}
> class C with M1, M3 {}
>
> If `Object with M1` gets canonicalized to the same class, then UP(B, C) should be that class.
I think that mixin class canonicalization by the back ends shouldn't affect things, because LUB is always done in the front end. But I could be wrong about that.
> I was going to say that I agree otherwise ... but LUB is stoopid. So I think it's just not true in general.
>
> mixin Baddy {}
> class Helper<T> with Baddy {}
> class Sub1 extends Helper<int> {}
> class Sub2 extends Helper<double> {}
>
> Then UP(Sub1, Sub2) should be Baddy, because there is no other shared superinterface before Object. The other superinterfaces differ in instantiation, which we only try to account for in the first level of UP, but not when falling back on LUB.
>
> The bug in the proof is:
>
> > Therefore, if C is the sole member of S with depth q, then this direct subclass must be the sole member of S with depth q+1
>
> A single direct subclass does not imply a single interface.
Well shoot, you're right. Good catch. I'll add links to your comment to the issue description (and my proof) for the benefit of others reading this issue.
> I have proposed, and stand by, that we should completely ignore anonymous mixin application classes in LIB, including not counting them for depth.
Assuming that by "LIB", you mean LUB 😃... Yeah, I think that would be a good solution for this issue.
I would also be cautious about super mixins. This program would break soundness if allowed. It is currently rejected by both the analyzer and the CFE for different reasons, but the error that the CFE gives is the one that this issue proposes to eliminate, I think.
class A {}
class B {}
class C implements A, B {}
abstract class D {
  void set foo(A a) {}
}
abstract class E {
  void set foo(B b);
}
mixin M1 implements E {}
mixin M2 on E {
  void test() {
    super.foo = B();
  }
}
class F extends D with M1, M2 {
  void set foo(Object o) {}
}
void main() {
  F()..test();
}
That unsoundness of super-member-invocations is not new, and we already deal with it.
The super.foo in M2.test is valid; it is type-checked against the on type's member signature. But the mixin application D with M1, M2 would not be, because M2 can only be applied to a class whose implementation of foo is a valid implementation of the signature of E.foo.
Since the implementation of foo of D with M1 has function type void Function(A), which is not a valid implementation of E.foo, the mixin application is disallowed.
Super-member-invocations do not use the member signature from the superclass's interface, but the actual implementation. It's usually the same, but you can skip an interface that isn't implemented.
abstract interface class I {
  void foo(num n);
}
class A {
  void foo(int i) {}
}
abstract class B extends A implements I {}
class C extends B {
  void bar() {
    // The argument type 'double' can't be assigned to the parameter type 'int'.
    super.foo(3.14);
  }
  void foo(num n) {}
}
mixin M on I {
  void bar() {
    super.foo(3.14);
  }
  void foo(num n) {}
}
// The super-invoked member 'foo' has the type 'void Function(num)',
// and the concrete member in the class has the type 'void Function(int)'.
class D = B with M;
For mixin super-member-calls, we can't check at the call point. Instead, we remember every member that is super-invoked, type-check it against the combined on-type interface, and then check, for every mixin application, that it has an implementation of every super-invoked member which is a valid implementation of the combined on-type member signature.

We've got this issue covered, because we're not looking at the interface for super-invocations, and when we are (for mixin applications), we require the implementation to satisfy the on-type interface, which must have a consistent combined member signature. The actual interface member signature of the superclass doesn't matter for that.
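For contrast, here's a hedged sketch of an application that passes that check, reusing I and M from the example above (A2 and D2 are names I made up):

class A2 implements I {
  // void Function(num) is a valid implementation of I.foo.
  void foo(num n) {}
}
// M's super-invoked `foo` is checked against the on-type signature I.foo;
// A2's concrete foo satisfies it, so this mixin application is allowed.
class D2 = A2 with M;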
First, I think we should avoid specifying that any class can have an interface that violates the common well-formedness rules. In particular, we shouldn't allow D with M in the original example because it is a class that has two superinterfaces such that there is no combined member signature for foo, and we generally refuse to ignore or resolve those conflicts implicitly. Here is one way it could be mitigated:
class A {}
class B {}
class C implements A, B {}
abstract class D {
  A get foo;
}
abstract class E {
  B get foo;
}
mixin M implements E {}
mixin Mitigate {
  C get foo;
}
abstract class F extends D with Mitigate, M {}
void main() {}
This may not be very elegant, but it's possible, it's expressed locally at the declaration that has a problem, and I doubt that the issue will come up very frequently.
The reason why we might want to ignore some well-formedness violations in anonymous mixin applications would be that they can never be the type of an expression. However, that's building on sand because it implies that we can never introduce any generalization whereby it will be possible for a mixin application to be the type of an expression (say, if all subtypes are private to some other library, or whatever). So I'd very much prefer that we don't paint ourselves into that corner. ;-)
Next, here's an interesting twist on the computation of the UP using the old Dart 1 algorithm. Consider the following program:
// Support for inspecting static types.
typedef Exactly<X> = X Function(X);
extension<X> on X { X expectStaticType<Y extends Exactly<X>>() => this; }
// Example superinterface graph.
class A {}
mixin M {}
class B extends A with M {}
class C {}
class D extends C {}
class E extends D {}
class F1 extends B implements E {}
class F2 extends B implements E {}
// Test `UP` on this graph.
void f(bool b, F1 f1, F2 f2) {
  var x = b ? f1 : f2;
  x.expectStaticType<Exactly<D>>();
}
void main() {}
The superinterface structure is as follows:
graph BT;
Object --> Object?
A --> Object
M --> Object
B --> AWithM["A & M"]
AWithM --> A
AWithM --> M
C --> Object
D --> C
E --> D
F1 --> B
F1 --> E
F2 --> B
F2 --> E
The fun part is that both the analyzer and the CFE accept the program above with no errors, which shows that they consider D to be the deepest node that is alone at its level.
In other words, both tools ignore the anonymous mixin application A & M when they execute the last case in UP (that is, the Dart 1 algorithm). They do not ignore it when they're counting the depth of each class, but they do ignore it when they are counting how many nodes there are at each level. This may be done consistently or otherwise, but it is certainly the behavior that we can observe for this particular example.
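Spelling out the computation for this graph makes that concrete (counting Object as depth 0 and leaving out Object? for simplicity):

depth 0: Object
depth 1: A, M, C
depth 2: A & M, D
depth 3: B, E
depth 4: F1, F2

The shared superinterfaces of F1 and F2 are B, E, A & M, A, M, C, D, and Object. At depth 3 there are two candidates (B and E); at depth 2 the spec sees two (A & M and D), but a tool that doesn't count the anonymous application sees only D, hence the observed result D.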
We could choose to change the specification accordingly and keep the behavior of the CFE (and adjust the analyzer to work in the same way).
In contrast, it would probably be a bug-prone and subtle breaking change to ignore the mixin application both when the depth is computed and when we're searching for the deepest node which is alone at its level.
We could also change the implementations to follow the current specification (such that no classes are ignored when it is counted how many nodes there are at each level). This would be a breaking change, too, but it seems more consistent.
In the concrete example, x would then get the type Object. In general, this change would make the inferred type of conditional expressions more general, and a context type could be used to re-enable the type that the expression used to have.
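For instance, with the classes from the example above, the context type would be supplied like this (a hedged sketch; the function name is mine, and it relies on Dart's context-sensitive typing of conditional expressions):

void g(bool b, F1 f1, F2 f2) {
  // Inferred via UP; under the fully-specified algorithm this would be
  // Object rather than D.
  var x = b ? f1 : f2;
  // With an explicit context type, both branches are checked against D,
  // and the expression gets type D regardless of what UP produces.
  D y = b ? f1 : f2;
}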
I'm not sure why ignoring the anonymous mixin application class in the depth calculation would be more error prone than anything else.
I think it's more confusing, and potentially assumption breaking, to have a hole in the inheritance chain. The fact that LUB can yield a non-trivial type for two types with no shared superinterface other than Object is definitely surprising.
Lowering the depth of subtypes dominated by an abstract mixin application class will ensure that there are no such gaps.
> I'm not sure why ignoring the anonymous mixin application class in the depth calculation would be more error prone than anything else.
I'm comparing two situations: (1) keep the current behavior in the CFE and adjust the analyzer to match, vs. (2) change the algorithm such that anonymous mixin applications are skipped both when computing the depth of any node in the graph and when counting the number of nodes in each level (this implies changes to the CFE as well as the analyzer). I'm simply saying that (1) is safer than (2) because (1) will definitely not break existing programs, and (2) will break some at compile time and others at run time.
If you consider the example I gave then (2) would cause x to have type E rather than D, which could change the type arguments of a List (or some other class whose signature has non-covariant occurrences of a type variable in a member signature), and that could give rise to run-time errors at some point later during execution. The program will most likely compile just fine after this change, and it can be quite tricky to try to track down why it's now a List<E> rather than a List<D>, when the source code hasn't been modified at all, and the type error is thrown in a location that has nothing to do with the location where the actual new behavior occurred.
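To make that failure mode concrete, here's a hedged sketch reusing the classes from the earlier example (the function name and the extra D parameter are mine):

void g(bool b, F1 f1, F2 f2, D d) {
  // Today this literal is inferred as List<D>; under change (2) it would
  // silently become List<E>, because UP(F1, F2) changes from D to E.
  var xs = [b ? f1 : f2];
  // This assignment stays valid either way (List<E> is a List<D>)...
  List<D> ys = xs;
  // ...but this add throws a TypeError at run time if xs is actually a
  // List<E> and d is not an E, far from where the inference changed.
  ys.add(d);
}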
From a conceptual point of view, you could claim that it's a "cleaner" approach to skip the anonymous mixin application both when computing depths and when counting nodes at each depth, but I doubt that the reliance on depth is sufficiently meaningful to developers in the first place to justify a breaking change that may silently cause your programs to start crashing.
However, I'd consider the approach where anonymous mixin applications are fully included as "cleaner" than both of those. I don't see any conceptual justification for claiming that a class which has been created by an abstraction (a mixin application) should be considered less first-class than a class which has been created by writing a new class body (which could also be called a "mixin literal").
Would you also want tear-offs like print or myReceiver.aMethod to be anything less than a full-fledged function object, and only function literals should be first-class? I think we should strive for semantic entities (like functions or classes) to have first-class status consistently if any syntactic forms have first-class status.
Nevertheless, I'd recommend that we simply keep the current semantics and adjust the specification and the analyzer to match. That is, we should use approach (1).
> The fact that LUB can yield a non-trivial type for two types with no shared superinterface other than Object is definitely surprising
I don't see that happening. For this algorithm (not for LUB in general, but for the Dart 1 algorithm which is used in the last case), the chosen type will always be a superinterface of both of the operands. It may or may not be Object or Object?, but it will be a shared superinterface.
(For LUB in general we just know that the result will always be a shared supertype, and this might be a non-superinterface for both operands, e.g., UP(List<int>, List<double>) == List<num>.)
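This is easy to observe with the expectStaticType helper from the earlier example (a hedged sketch; the function and parameter names are mine):

typedef Exactly<X> = X Function(X);
extension<X> on X { X expectStaticType<Y extends Exactly<X>>() => this; }

void g(bool b, List<int> ints, List<double> doubles) {
  // UP(List<int>, List<double>) == List<num>: a shared supertype, but not
  // a superinterface of either operand.
  var x = b ? ints : doubles;
  x.expectStaticType<Exactly<List<num>>>();
}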
If we consider a third model which is to (3) fully implement the specified algorithm (that is, an anonymous mixin application is treated exactly like any other class) then this example from @lrhn shows that an anonymous mixin application can be a perfectly meaningful result for UP to return. It is a shared superinterface for two types, it may contribute additional members that we can call, and it may contribute additional supertypes. It is also conceptually a superinterface, even though we can't denote it explicitly. So I definitely think the full implementation of the already specified behavior would be the cleanest approach.
Nevertheless, I'd still recommend that we keep the behavior (1), simply because it isn't worth the trouble to introduce a breaking change for this purpose.
Edit: Removed the part about UP never returning an anonymous mixin application, because this is a counterexample.
> However, I'd consider the approach where anonymous mixin applications are fully included as "cleaner" than both of those.

It is. It's just not particularly useful. If a user ever ends up inferring the type of something as the type of an anonymous mixin application class, they can't use that for anything.
(But then, I also suggest ignoring all inaccessible declarations in the depth computation used by LUB. Splitting parts of a class into a private base class should be an invisible change, and yet it's potentially breaking. I want a language where that isn't visible outside of the declaring library in any way. If the superinterfaces of a class change, but the only changes are to types that you can't and shouldn't see, then it shouldn't be visible. If we can't find a better LUB, maybe we can at least omit the types that the user shouldn't know about. Then I'd consider anonymous mixin application classes as inaccessible everywhere, even in the same library. Because they are --- you can't name them.)
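Here's a hedged sketch of the kind of leak being described (the classes are invented for illustration; imagine _Base is private to some other library):

class _Base {}            // private: importers can't see or name it
class P extends _Base {}
class Q extends _Base {}

void g(bool cond, P p, Q q) {
  // By the Dart 1 algorithm, UP(P, Q) is _Base: the unique deepest shared
  // superinterface. An importing library could neither see nor name it,
  // yet it becomes the inferred type of x.
  var x = cond ? p : q;
  print(x);
}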
> they can't use that for anything
They can call members declared by the mixin, they can assign it to types that are supertypes of the mixin, all the usual stuff. The only thing they can't do is to write a term that denotes this type, but even that doesn't stop this type from being the inferred type of a local variable that has no declared type. I can't see why that would be considered to be 'nothing'.
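As a hedged illustration, assuming the spec's UP result Base & Baddy for @lrhn's example (greet is a member I've added; as observed above, current implementations may infer Object here instead, in which case the last three lines wouldn't compile):

mixin Baddy {
  void greet() {}
}
class Base {}
class Helper<T> extends Base with Baddy {}
class Sub1 extends Helper<int> {}
class Sub2 extends Helper<double> {}

void h(bool b, Sub1 s1, Sub2 s2) {
  // Per the spec, x has the unnameable type `Base & Baddy` here.
  var x = b ? s1 : s2;
  x.greet();          // members declared by the mixin are callable,
  Base asBase = x;    // and x is assignable to the application's
  Baddy asBaddy = x;  // supertypes, even though its type can't be written.
}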