Records: zero length and unary records

The records proposal forbids empty records and records with a single entry:

> A record may have only positional fields or only named fields, but cannot be totally empty. There is no "unit type". A record with no named fields must have at least two positional fields. This prevents confusion around whether a single positional element record is equivalent to its underlying value, and avoids a syntactic ambiguity with parenthesized expressions.

There has been discussion in the past of trying to exploit the symmetry between parameter lists and records. I'm slightly worried that this restriction may come back to bite us, since it prevents (e.g.) uniformly reifying argument lists as records. Is it possible to make this a restriction only on the literal syntax? That is, semantically we still have zero length and unary records, we just have no literal syntax for them?
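
As an analogy from outside Dart, Python reifies positional argument lists as tuples, and it can do so uniformly only because zero-element and one-element tuples exist as values. A minimal sketch of that idea follows; the function name `reify` is made up for illustration, and nothing here describes Dart behavior:

```python
def reify(*args):
    """Return the positional argument list as a tuple (hypothetical helper)."""
    return args


# Every arity maps to a tuple value, including zero and one.
zero = reify()
one = reify(1)
two = reify(1, 2)

assert zero == ()
assert one == (1,)
assert two == (1, 2)

# The one-element tuple is a distinct value from the element it wraps.
assert one != 1
```

If tuples of length zero or one did not exist, `reify()` and `reify(1)` would have no value to return, which is the kind of gap the restriction above could create for argument list reification.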

As a second step (or an alternative step) we could now or in the future add alternative syntax that generalized fully. For example, I can imagine:

Adding a static constant Record.unit on Record, which is the unique zero length record.
Adding a unary constructor Record.single on Record, which produces a unary tuple.
Or generalizing fully and saying that there is a "magic" n-ary constructor on Record such that Record(...) produces the record corresponding to the literal record syntax (...), with the additional generality that ... may be empty or a single positional argument.

cc @munificent @lrhn @eernstg @natebosch @jakemac53 @stereotype441

Aug 05 '22 23:08 leafpetersen

Unary records used to be in the proposal and I took them out. They caused some disagreement around whether a unary record is isomorphic to its field or not. Likewise, is the zero-length record null, or something different? Since we won't have argument list reification in the first release anyway, I figured the safest approach is to not support nullary and unary positional records at all.

I anticipate adding them in a future release when we add support for argument list reification and spreading.

If you think it's worth adding them now, I think we can make it work. Python uses a trailing comma to disambiguate parenthesized expressions and one-field tuples:

var number = (1);
var record = (1,);

It's a little funny looking, but could work. I like your suggestion of just exposing them through an API.
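
For reference, the Python behavior mentioned above can be checked directly. This is a small sketch in Python, not Dart; it only illustrates how a trailing comma separates a parenthesized expression from a one-element tuple there:

```python
number = (1)   # parenthesized expression: just the int 1
record = (1,)  # trailing comma: a one-element tuple
empty = ()     # the empty tuple is written without a comma

print(type(number).__name__)  # int
print(type(record).__name__)  # tuple
print(len(record), len(empty))  # 1 0
```

Note that Python spells the empty tuple `()`; `(,)` is a syntax error there.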

We do have to think about pattern matching too, though. The current proposal does allow parentheses for grouping in patterns, which would be ambiguous with a unary record pattern.

Aug 05 '22 23:08 munificent

> They caused some disagreement around whether a unary record is isomorphic to its field or not. Likewise, is the zero-length record null, or something different?

I think the answer to both of these would clearly have to be no. Life gets very squirrely if you say yes, and I see no benefit to doing so.

> If you think it's worth adding them now, I think we can make it work.

I think it's at least worth making it explicit that the runtime set of values should be considered to include unary and nullary records, even if we don't add syntax for introducing them (to avoid implementations building in assumptions that become problematic in the future).

> Python uses a trailing comma to disambiguate parenthesized expressions and one-field tuples:

This would be fine. Maybe even (,) for the empty tuple? The nice thing about that syntax is that it strongly discourages anyone from actually using it... :)

> I like your suggestion of just exposing them through an API.

If we do nothing else, I'd probably suggest doing that, at least so we can test them. Though I guess we could keep it private for now.

Aug 06 '22 00:08 leafpetersen

See also previous discussion here. Apparently I've started to repeat myself. And say the same thing multiple times as well.

Aug 06 '22 00:08 leafpetersen

We could just do it later: The zero and one component records could be part of a future enhancement about spreading tuples into actual argument lists, as long as we make sure those records are a syntax error. That's true for (,), and probably for (e,) (if it can't be parsed as an actual argument list itself).

Aug 06 '22 06:08 eernstg

I don't know if this was considered, but couldn't you also use types to differentiate between a parenthesized expression and a unary record?

int number = (1);
(int) record = (1);

A similar thing is currently done to differentiate maps and sets.

Map map = {};
Set set = {};

Aug 16 '22 05:08 mmcdon20

Using types to differentiate between ambiguous syntactic constructs is possible, but comes with a cost of, well, ambiguity. We still need to give a meaning in the case where there is no context type. That'll probably be the existing parenthesized expression meaning. And authors need to be absolutely sure which context type they have, and make sure it doesn't change, because otherwise their code might stop compiling. Or worse: Compile and do something else.

Imagine someone wrote extension methods on tuples, like:

extension Await2<S, T> on (Future<S>, Future<T>) {
  Future<(S, T)> get wait =>
      Future.wait([this.0, this.1]).then((list) => (list[0] as S, list[1] as T));
}

// For completeness.
extension Await1<S> on (Future<S>) {
  Future<S> get wait => this.0;
}

You'd think that (future).wait would work, but because receivers have no context type, you're just doing future.wait. (Silly example, I know. But something like that is bound to happen.)

At that point, you'd need a way to create a singleton tuple, and you don't have a context type, so you're stuck. That's why we want an explicit syntax, even if it's (e,) (which is consistent with allowing trailing commas in lists, because records are lists of expressions, and parenthesized expressions are not).

(I'd still like the implicit type-based conversion, but I know other people in the language team are more wary about adding those, and with good reason.)

Aug 16 '22 08:08 lrhn

If we have unary and nullary tuples, we also need to have types for them.

The most consistent type syntax would probably be (int) and (). In a type position, those are unambiguous (anything starting with ( in a type position is a record type). So, probably less of a problem than the record literal syntax.

(I still think using null as () is a somewhat reasonable thing to do. The value null represents "no value". The zero-product of value types represents no value. It's the same thing! The problematic issue is that if we make () <: Record, then Null <: Record <: Object. An option is to make null implement both Null and (), without the types being the same. I'm sure some code will still get really confused if null is Object. For the record, I now think we should have made null <: Object in null safety.)

Aug 16 '22 08:08 lrhn
