language icon indicating copy to clipboard operation
language copied to clipboard

How should name collision in tuples be handled.

Open lrhn opened this issue 5 years ago • 12 comments

The current proposal for tuples introduces a field1 getter for a positional entry and a name getter for a named entry named name. It's made a compile-time error to have a named entry with the same name as a member of Object (toString, runtimeType, noSuchMethod, hashCode).

It's not defined what happens if a named entry has the name field1 and there is also a positional entry which would get that name, It should probably also be an error. However, all these restrictions may interfere with uses of tuples. A function can have a named parameter named toString, and if you can't create a tuple with a named entry with that name, then you can't spread a tuple as all the arguments of a function call of that function. You can't properly abstract over function arguments if the tuples are restricted in names in ways that function parameters are not.

The fieldX names themselves may be open to discussion. The language specification doesn't use the name "field" (it's "instance varibles"), so hard-coding the term into the language seems a little odd. Any name can cause a problem

Should we do something else than naming the positional entries? Should we allow named entries with any name, and provide access to them?

lrhn avatar Nov 06 '20 16:11 lrhn

My proposal: Allow the [] operator to be used on expressions of a tuple type. The argument must be a literal (possibly just a constant, but I'm not sure even that's worth it). It's not a method invocation, it's special language syntax for projecting from a tuple. The operand must be either an integer or a symbol. You can also access named entries by .name as long as they're not conflicting with members of Object. So:

(int, int, {int x, int hashCode}) p = (0, 1, x: 2, hashCode: 3);
print(p[0]); // 0
print(p[1]); // 1
print(p.x); // 2
print(p[#x]); // 2
print(p[#hashCode]); // 3

Again, there is no [] member on the "interface" of tuples. It's a language level operation you can perform on expressions of tuple type, and because the operand is a literal, and the tuple type is known (since the expression has a tuple type), the expression can be given the correct static type.

You can use .name for named entries, and you can fall back on [#name] in case of conflicts.

(This means that tuple types do occupy the [] operator, so extensions cannot add operator[] to a tuple type, just as you can't add call to Function or anything to dynamic or Never).

This gives a readable projection syntax. It requires named members to be added for positional entries, which avoids one source of name conflict, and it provides a canonical way to access any entry, as well as a convenient way for unconflicted named entries.

The DestructureX interfaces would have no reason to exist.

lrhn avatar Nov 06 '20 17:11 lrhn

Looks good!

However, it won't take long before we get this request:

(int, int, {int x, int hashCode}) p = (0, 1, x: 2, hashCode: 3);
for (var index in [0, 1, #x, #hashCode]) print(p[index]);

We could handle this by a compile-time loop unrolling, but I guess the answer will be "Just say no!". ;-)

eernstg avatar Nov 06 '20 17:11 eernstg

It's not defined what happens if a named entry has the name field1 and there is also a positional entry which would get that name, It should probably also be an error.

Yeah, I think I made that an explicit error in some draft of the proposal but accidentally forgot it. I think this should be an error.

A function can have a named parameter named toString, and if you can't create a tuple with a named entry with that name, then you can't spread a tuple as all the arguments of a function call of that function.

This is a fair point, but I wonder if the complexity needed to support fully generalizing over all possible parameter lists carries its weight.

It's not a method invocation, it's special language syntax for projecting from a tuple. The operand must be either an integer or a symbol.

I think using the [] syntax for this would cause an unending series of bug reports from very confused users who don't understand why the operator works fine with computed indexes/keys on lists and maps, but not records.

I get the principle of not wanting to have certain member names treated as specially by the language, but I think adding a whole separate kind of "record field access" syntax and semantics to the language feels like a lot of mechanism for not a lot of value in return. We don't give you any way to define a method named call() in a class without that implicitly letting the class be used a function. Likewise, if I remember right, before Dart 2.0 you could run a for-in loop on anything that had a .iterator member, regardless of whether it declared it implemented Iterable. Neither of these seemed particularly problematic to me.

munificent avatar Nov 06 '20 19:11 munificent

(For the record, I have always wanted you to use operator () or operator call instead of just call to make an object callable.)

lrhn avatar Nov 06 '20 23:11 lrhn

@munificent: could you please explain your preference for the name fieldN over the more traditional $N?

Traditional according to what tradition? Using $ in an identifier is always dubious in Dart because it makes string interpolation really confusing. In practice, the only places I see $ used is in generated code, specifically to send a signal that "this is weird, don't use it".

Kotlin uses componentN, which is kind of long but I like that it's human-friendly.

I would have considered _n, but unfortunately the leading underscore is already taken in Dart to mean privacy.

(For the record, I have always wanted you to use operator () or operator call instead of just call to make an object callable.)

Me too. But not doing that hasn't seemed to have caused us much pain.

munificent avatar Nov 09 '20 18:11 munificent

at least it's used in sh. Kind of :-)

In my book, that's a signal to not do something. :)

I like .n notation. It's short and not confusing

I kind of like that and have idly considered it. It has at least one edge case ambiguity:

extension on (String, String) {
  method() {
    print(.0); // Oops! double literal not a field access on implicit this.
  }
}

But, overall, I think it's kind of strange looking and probably not worth adding new syntax for.

If this is not acceptable, then maybe C#'s variant .itemN is not that bad?

That's OK too. Another option is elementN.

I don't see anything wrong with [n] notation either. If n is a variable, there will be a runtime check. How is this case different from a list subscript? Why more "compile-time safety" is required here?

var record = ("str", 3);
var i = isTuesday ? 0 : 1;
var field = record[i];

What is the static type of field?

munificent avatar Nov 12 '20 01:11 munificent

```dart
var record = ("str", 3);
var i = isTuesday ? 0 : 1;
var field = record[i];

What is the static type of field?

I'm actually somewhat interested in this syntax, and I would say that a[i] is a static error if a has a tuple type, and i is not an integer literal in the appropriate range. You to then need to decide whether (record as dynamic)[i] is a runtime error, or whether it evaluates (slowly) to the ith field. I think either approach is valid.

leafpetersen avatar Nov 12 '20 03:11 leafpetersen

The question about the least upper bound of two tuple/record types is basically determined by the subtype structure (cf. #1278): If we have DestructureN types then there's subtyping among different shapes (width subtyping), but it's allowed to drop all named fields and some positional fields, and nobody has proposed that we should consider width subtyping for subsets of the named fields. So LUB((int, {int foo}), (int, {int foo, int bar}) == Destructure1<int>, and only the positional field can be looked up. For the (int, int) and (double, double) case we'd use covariance and get Destructure2<num, num>. I think the main question here is the underlying subtype structure, and then the least upper bounds and the questions about accessible members will follow.

For the treatment of (myRecord as dynamic)[i] or (myRecord as dynamic)[mySymbol], it could be claimed that it is consistent with much of Dart to perform the lookup at run-time (possibly slowly), but we could also note that this amounts to a dynamic member lookup operation, and that is not something which is available without reflection.

Hence, it seems consistent to require for e1[e2] where the type of e1 is a subtype of Record that e2 is a constant expression of type int or Symbol, and the type of e1[e2] is then the type of that component — and e1[e2] will throw at run time if e1 is a record with static type dynamic.

eernstg avatar Nov 12 '20 08:11 eernstg

For LUB, I'd go with LUB(t1, t2) where t1 and t2 are tuple types as:

  • If t1 and t2 have the same structure (same number of positionals, same names) then do pointwise LUB on each position
  • otherwise the result is Record.

I don't want the DestructureX interfaces, and if they interfere with LUB, it becomes even more confusing. If you do:

var p1 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, foo: 42);
var p2 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, bar: 42);
var p = test ? p1 : p2;  // infer Destructure10<int, int, ..., int>.
print (p.field0); // prints "1";

then it works, but if you then add an element to p1:

var p1 = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, foo: 42);

then there is no Destructure11 interface and it has to either just go to Record or stay at Destructure10. The latter is only possible if we make every tuple with >10 positional elements implement Destructure10 (which probably means making Destructure10 <: Destructure9 <: etc. Which I'm not sure I want either, even though it's obviously possible). If we change the access to p[0] instead, then the DestructureX interfaces won't help for member access, you need a tuple type.

lrhn avatar Nov 12 '20 11:11 lrhn

@tatumizer It's one way of thinking, but it requires (int, int, {int foo, int bar}) to be a subtype of (int, {int foo}). Subtyping means assignability. (Or at least it requires assignability since we're assigning (0, 1, foo: 5, bar: 5) to a which will have that type, we might avoid the subtyping by doing a coercion at the assignment).

Because of assignability, we should be able to do the un-conditional assignment (int, {int x}) p = (1, 1, bar: 1, foo: 1);. Throwing away values like that is very likely to hide errors. I want to not allow that, for the same reasons I don't want to allow over-applying functions. Example:

void foo(int x, {int y = 0}) => ...; 
foo(1, 1, foo: 0, bar: 0);

That looks like, and probably is, an error. We do not allow it. Tuple assignment is very closely related to parameter passing, so having significantly different behavior is a design smell.

For that reason, the current specification does not have any subtype relations between tuple types based on common sub-structure.

lrhn avatar Nov 12 '20 19:11 lrhn

I'm actually somewhat interested in this syntax, and I would say that a[i] is a static error if a has a tuple type, and i is not an integer literal in the appropriate range.

My earlier comment on this was: I think using the [] syntax for this would cause an unending series of bug reports from very confused users who don't understand why the operator works fine with computed indexes/keys on lists and maps, but not records.

For LUB, I'd go with LUB(t1, t2) where t1 and t2 are tuple types as:

  • If t1 and t2 have the same structure (same number of positionals, same names) then do pointwise LUB on each position
  • otherwise the result is Record.

That's what the proposal states too.

Here the type of a is computed as the type of "biggest common substructure". I think this is a logically consistent way of thinking.

You're correct. It's logically consistent and sound. In other languages, this is called "width subtyping" or "row polymorphism". However, I think it's looser than we want and is more likely to mask bugs than do anything useful. The goal of a type system isn't necessarily to statically allow all operations that would not fail at runtime. It's to help users write the code they intend to write and help them not write the code they don't intend to write.

munificent avatar Nov 13 '20 00:11 munificent

I went ahead and removed the synthesized getter names for positional fields from the proposal. This means that currently the only way to access positional fields is using pattern matching and destructuring. I'll re-open this issue to talk about potentially adding a new syntax for accessing those fields.

munificent avatar Jun 27 '22 21:06 munificent

...and there is a newer issue to discuss this that I think has more up-to-date conversation, so I'll close this one in favor of that: #2388.

munificent avatar Aug 19 '22 00:08 munificent