language Primary constructor on classes

Primary constructor on classes

Open leafpetersen opened this issue 1 year ago • 72 comments

[Update: There is considerable interest on the team in adding primary constructors as a general feature. This original discussion issue has been repurposed as the tracking issue for the general feature request.]

Introduction

Primary constructors is a feature that allows for specifying one constructor and a set of instance variables, with a concise and crisp syntax. Consider this Point class defined using the current class syntax for the constructor and fields:

class Point {
  int x;
  int y;
  Point(this.x, this.y);
}

With the primary constructor feature, this class can defined with this much shorter syntax:

class Point(int x, int y);

Discussion

[original issue content below]

In the proposal for structs and extension structs, I propose to add primary constructors to structs. Briefly, the class name (or the type parameter list if any) may/must be followed by a parenthesized list of variable declarations as such:

  struct MyStruct(int x, int y) {
      // members here
  }

In the struct proposal, these are always final by default, and are restricted in various ways (i.e. they may not be late, they may not be const). They are allowed to declare initializers, which are used to generate default initialization values.

This issue is to discuss the possibility of splitting this out, and making it a general feature for classes as well.

Initial points in favor of this include:

It would be consistent, and nice to have this available for classes
It makes structs less different from classes

Initial points against include:

Resolving the tension between the desire to have final by default for structs, and the existing mutable by default behavior in classes.
Classes support richer superclass structure that may make specifying how the generated constructor works more complicated
Dealing with const constructors here seems more complicated than in the data class case.

Jul 29 '22 22:07 leafpetersen

cc @mit-mit @lrhn @eernstg @chloestefantsova @johnniwinther @munificent @stereotype441 @natebosch @jakemac53 @rakudrama @srujzs @sigmundch @rileyporter @mraleph

Jul 30 '22 00:07 leafpetersen

I'd say "yes".

If it works, it feels odd to not allow it. I can't see a reason it shouldn't work. It's a little weird that it only works with final instance variables, but I think that's acceptable.

But, that depends a lot on what model/capabilities we end up with for the fields.

The current model uses the primary constructor to declare fields, and only secondarily as a template for a default constructor. That is, it's not necessarily a "constructor". You can also write other constructors, and initialize fields directly in those.

We could consider a different model where the "primary constructor" directly defines the unnamed constructor. That means changing the (...) to a parameter list instead of a list of field declarations, and then deriving the field declarations from the parameters, instead of the other way around. (And then you can perhaps even extend another non-abstract struct, and forward parameters using super.foo syntax.) Otherwise people will need to write a constructor anyway, when they want it to have named parameters. I think that'd be a lost opportunity for a shorthand syntax for classes, where you don't need to write a constructor directly.

With that, I'd also say that any other generative constructor must be redirecting (eventually) to the primary constructor. Since the primary constructor initializes all fields. (Or, we can allow adding a name, class Foo._(args) {}, to make the primary constructor be a named, possibly private, constructor.)

If we do allow the syntax for classes too, I'd also have those classes get the default == and hashCode implementations (if they don't inherit or declare an implementation of either other than the one from Object). That makes sense because the primary fields are known. (Not sure that works if you extend another kind of class, though, because you can't just delegate to super.==).

A class declaration with a primary constructor can add further instance variables, but they must be self-initializing (nullable or having an initializer). Those other variables would also not be part of the == or hashCode implementation. (If you need that, you must declare the ==/hashCode yourself.)

Jul 30 '22 12:07 lrhn

Yes, please! ;-)

We could use the following rule: The primary constructor of a struct declares non-late instance variables that are final by default. The primary constructor of a class declares non-late instance variables that are non-final by default, but may have the keyword final.

Classes can still declare additional instance variables, of any kind, as before. Structs might be able to do this as well—what's the harm in allowing a struct to have final int now = DateTime.now().millisecondsSinceEpoch;?

The primary constructor syntax gives rise to a constructor declaration and a set of instance variable declarations. Every constructor is checked statically relative to the result of this desugaring step.

In other words, there is no reason to require that all constructors are redirected to the generated primary constructor, they just need to satisfy the normal constraints that we have today (e.g., that a non-late final instance variable must be initialized before any code with access to this runs).

About this:

changing the (...) to a parameter list instead of a list of field declarations, and then deriving the field declarations from the parameters, instead of the other way around

I think that's a very interesting idea to explore.

Aug 01 '22 14:08 eernstg

With that, I'd also say that any other generative constructor must be redirecting (eventually) to the primary constructor. Since the primary constructor initializes all fields.

Not sure I follow this. I was proposing to allow other generative constructors, they just must also initialize all of the fields as usual.

(Or, we can allow adding a name, class Foo._(args) {}, to make the primary constructor be a named, possibly private, constructor.)

On reflection, I was thinking of modifying the proposal to say that if there is a "primary constructor", then there is always a Foo._ constructor generated, and if there is no explicit default constructor, then one is generated that redirects to ._.

In other words, there is no reason to require that all constructors are redirected to the generated primary constructor, they just need to satisfy the normal constraints that we have today (e.g., that a non-late final instance variable must be initialized before any code with access to this runs).

This was the model I had in mind.

Aug 01 '22 20:08 leafpetersen

I totally agree that this syntax should be applicable to classes as well. You can make an argument that it is good enough if it only supports simple cases, e.g.

class X (var x, int y, final String z, {super.key}) extends Y {
  final int w = x + y;
}

// equivalent to 

class X extends Y {
  var x;
  int y;
  final String z;
  final int w;
  X(this.x, this.y, this.z, {super.key}) : w = x + y;
}

and for anything else people can resort to traditional constructor syntax.

You can probably make const work as well if you say that class with a primary constructor of form (final f, ..., final z) (all fields final) automatically gets const constructor.

We can maybe even make something like:

class X (var x, int y, final String z, {key}) extends Y(x, key: key) {
  final int w = x + y;
}

work.

The weakest point of this is going from shorthand syntax to long syntax once you realise that you need constructor body, but maybe this rarely happens.

Aug 02 '22 13:08 mraleph

I 1000% want any primary constructor syntax to be generalized to classes. In fact, I personally care more about that than I care about the entire views proposal. :) Most user-types do not have value semantics (==, hashCode) but do have simple enough constructors that they could use this syntax.

To validate that, I scraped a big corpus of code from itsallwidgets.com, open source Flutter apps, and pub packages (18+MLOC) to determine which classes with at least one generative constructor could not use a proposed primary constructor syntax. The reasons a class might not be able to use a primary constructor sugar that I considered are:

"Multiple generative ctors": There are multiple generative constructors. In practice, many of these classes could still probably use a primary constructor for one of them, but these were rare enough that I didn't bother trying to distinguish them. So consider the results below a slight undercount.
"Non-empty body": The constructor body isn't empty.
"Non-forwarded superclass ctor param": The constructor has a super() constructor initializer that passes an argument that isn't simply a forward from a constructor argument. (In other words, there's a superclass constructor argument that couldn't use super. instead.)
"Non-forwarded field initializer": The constructor has a field initializer that isn't simply a forward of a constructor parameter. (In other words, there's a constructor initializer that couldn't use this. instead.)

The results are:

-- Could use primary (109448 total) --
  82826 ( 75.676%): Yes
   9254 (  8.455%): No: Non-forwarded superclass ctor param
   7172 (  6.553%): No: Non-empty body
   4568 (  4.174%): No: Multiple generative ctors
   3859 (  3.526%): No: Non-forwarded field initializer
    935 (  0.854%): No: Non-empty body, Non-forwarded field initializer
    442 (  0.404%): No: Non-empty body, Non-forwarded superclass ctor param
    338 (  0.309%): No: Non-forwarded field initializer, Non-forwarded superclass ctor param
     54 (  0.049%): No: Non-empty body, Non-forwarded field initializer, Non-forwarded superclass ctor param

So a little more than 3/4 of all existing class declarations could use something close to the proposed primary constructor syntax. Note that I'm assuming here that a primary constructor syntax would support users controlling which parameters are positional, named, and/or optional and would allow private names. (In other words, I did not treat those as failures.)

I think the biggest design challenges are:

Whether fields should default to final or not. It's pretty obviously the right default for struct, but I think would be surprising for class. (Ideally, we would have always defaulted to immutable for fields and parameters, but that ship has sailed.)
How to make the syntax readable when there are many fields, doc comments, extends clauses, implements, with, etc. It can get pretty hairy to pack all of that into the header of a class.

A while back, I worked on a primary constructor strawman syntax that looked like:

class Rect new (
  final int x,
  final int y,
  final int width,
  final int height,
);

So instead of a parameter list right after the class name, there is a new keyword first. Having a keyword there allows a few things:

You can put it after the other header clauses. Since the field list is likely longer than the extends clause, type parameters, etc. I think it looks best last right before the class body, as in:

class ArgumentSublist extends Rule<Expression> implements FormatSpan new (
  /// The full argument list from the AST.
  final List<Expression> _allArguments,

  /// The positional arguments, in order.
  final List<Expression> _positional,

  /// The named arguments, in order.
  final List<Expression> _named,
) {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing blocks arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Compare that to what you'd get using the current proposal:

class ArgumentSublist(
  /// The full argument list from the AST.
  final List<Expression> _allArguments,

  /// The positional arguments, in order.
  final List<Expression> _positional,

  /// The named arguments, in order.
  final List<Expression> _named,
) extends Rule<Expression> implements FormatSpan {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing blocks arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Note how the extends and implements clauses are buried in the middle.

You can use different keywords. In my strawman, you could use const instead of new to make the primary constructor a const constructor. We could also allow you to use final to default to making all fields final. So the first example becomes:

class Rect final (
  int x,
  int y,
  int width,
  int height,
);

We could then do the same thing for struct which would allow you to define value types with mutable fields. (Which are, admittedly, dubious, but a thing users do in practice.)

In other words, this means the only thing writing struct instead of class does is give you default implementations of ==, hashCode, etc.

You can provide a constructor name. Having a keyword before the parameter list instead of the class name also provides a natural place to insert a constructor name if you want the primary constructor to be named:

class NestingLevel extends FastHash new.empty(
  /// The nesting level surrounding this one, or `null` if this is represents
  /// top level code in a block.
  final NestingLevel? parent,

  /// The number of characters that this nesting level is indented relative to
  /// the containing level.
  ///
  /// Normally, this is [Indent.expression], but cascades use [Indent.cascade].
  final int indent,
) {
  /// The total number of characters of indentation from this level and all of
  /// its parents, after determining which nesting levels are actually used.
  ///
  /// This is only valid during line splitting.
  int get totalUsedIndent => _totalUsedIndent!;
  int? _totalUsedIndent;
}

The downside, of course, is that this is a bit more verbose and a little different coming from other languages whose primary constructor is right after the class name. In cases where there isn't much else in the type header, there are few fields, and they aren't documented, I think the classic primary constructor syntax looks better. But once the type scales up (and in particular, once you document your fields, which I think is generally a good idea), it gets kind of hard to read.

Aug 02 '22 18:08 munificent

Compare that to what you'd get using the current proposal:

This is assuming no changes to documentation conventions, which I think is not realistic. From a brief look at some kotlin code, the equivalent might look more like:

  /// @param _allArguments The full argument list from the AST.
  /// @param _positiional The positional arguments, in order.
  /// @param _named The named arguments, in order.
class ArgumentSublist(
  final List<Expression> _allArguments,
  final List<Expression> _positional,
  final List<Expression> _named,
) extends Rule<Expression> implements FormatSpan {
  /// The number of leading block arguments, excluding functions.
  ///
  /// If all arguments are blocks, this counts them.
  final int _leadingBlocks;

  /// The number of trailing blocks arguments.
  ///
  /// If all arguments are blocks, this is zero.
  final int _trailingBlocks;

  void visit(SourceVisitor visitor) { ... }
}

Which looks fine to me (nit, I don't understand how the extra fields work in this class, since they're not initialized in the constructor?)

One way of looking at this is that intuitively, we write Foo<X, Y> for generic classes, and the intuition is basically that the type parameters are "parameters" to the class. And at the invocation site, you write them in the same place: Foo<int, int>(...arguments). The same intuition seems to me to carry over naturally: the constructor parameters are parameters to the class, and in an invocation, you put the arguments immediately after the generic parameters (or the class name if none). So using the "parameter" syntax immediately after the classname/generics seems very intuitive to me.

You can use different keywords. In my strawman, you could use const instead of new to make the primary constructor a const constructor.

Is there any reason not say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

We could also allow you to use final to default to making all fields final. So the first example becomes:
class Rect final (
  int x,
  int y,
  int width,
  int height,
);
We could then do the same thing for struct which would allow you to define value types with mutable fields. (Which are, admittedly, dubious, but a thing users do in practice.)

We could. It looks pretty weird to me though.

You can provide a constructor name.

This really feels a bit over-generalized to me. If you want a named constructor, just write the constructor.

The downside, of course, is that this is a bit more verbose

This is really the rub. My sense is that the more we generalize this, the more we lose the actual benefits. Your data scraping suggests that a huge majority of classes don't need the generality. So every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

Aug 03 '22 05:08 leafpetersen

Is there any reason not say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

We'd need to give you a way to opt out of being a const constructor if you don't want it. You may not want it if you plan to add further, non-final, fields to the class in the future. That will be a breaking change if the constructor is implicitly made const without you asking for it. In general, locking people into a constraint by default is dangerous. Even more to people who don't know about it. Those who do can usually come up with a workaround.

... every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

That's a very good point. The only counter-point is that ever feature we make default and automatic causes an extra step if you ever need to migrate away from the shorthand. If we make a primary constructor implicitly const, you need to remember to write const when you migrate off using primary constructors. (That's probably the smallest such issue, so not really an argument for not making it default to const. Not having an opt-out other than migrating away from the primary constructor is a bigger issue to me).

Aug 03 '22 12:08 lrhn

We'd need to give you a way to opt out of being a const constructor if you don't want it.

FWIW I think that majority of Dart developers don't concern themselves with such matters because they are not writing reusable code.

So I think we should not optimise defaults towards the minority that does.

Aug 03 '22 12:08 mraleph

@ lrhn

We'd need to give you a way to opt out of being a const constructor if you don't want it. You may not want it if you plan to add further, non-final, fields to the class in the future. That will be a breaking change if the constructor is implicitly made const without you asking for it. In general, locking people into a constraint by default is dangerous. Even more to people who don't know about it. Those who do can usually come up with a workaround.

To be slightly provocative, maybe the answer is to say "if you don't want it to be const, don't use a primary constructor". As @mraleph says, I think there is a lot of value for a feature like this that you don't have to use in optimizing strongly for the common case.

To be slightly less provocative, we could at least say that getting an implicit const constructor is part of the deal with structs/data classes. That is, if you say data class, you are opting in to implicitly final fields and implicit const constructor.

Not having an opt-out other than migrating away from the primary constructor is a bigger issue to me

I hear this, but I also think that there is an inherent cliff here. If you want a constructor body, you have to migrate away. If you want to delegate, you have to migrate away. If you want to initialize some fields in the initializer list, you have to migrate away. So saying that if you want non-const you have to migrate away doesn't feel that bad to me.

Aug 03 '22 19:08 leafpetersen

This is assuming no changes to documentation conventions, which I think is not realistic.

That's a good point. Hoisting all the field docs alleviates much of my readability concerns.

One way of looking at this is that intuitively, we write Foo<X, Y> for generic classes, and the intuition is basically that the type parameters are "parameters" to the class. And at the invocation site, you write them in the same place: Foo<int, int>(...arguments). The same intuition seems to me to carry over naturally: the constructor parameters are parameters to the class, and in an invocation, you put the arguments immediately after the generic parameters (or the class name if none). So using the "parameter" syntax immediately after the classname/generics seems very intuitive to me.

Yeah, I agree it is 100% intuitive to have the parameters right there. I just think it looks funny when you end up having the extends/implements/with clauses jammed between the primary constructor and the class body. But... I'm convinced that it's the least bad approach.

Is there any reason not say that every primary constructor is a const constructor (at least if the superclass has a const constructor)?

No, good point.

This is really the rub. My sense is that the more we generalize this, the more we lose the actual benefits. Your data scraping suggests that a huge majority of classes don't need the generality. So every bit of generality that we add that makes that majority more verbose has a massive incremental cost in aggregate, and only benefits a few niche cases.

Yes, I think I'm sold. I've poked around a bunch of Kotlin code and it does look weird to me to have the superclasses and superinterfaces wedged between the primary constructor and class body. But in practice, it seems like most classes with complex inheritance hierarchies don't use primary constructors. For those that do... it looks a little weird (and people seem to format them in a variety of creative ways), but not intolerable.

OK, so what I'd suggest then is:

A primary constructor is a parameter list that appears directly after the struct or class name. It can have positional, optional, named, and required parameters as the user wants.
Each parameter in the list (that isn't a super. parameter) becomes a field on the type initialized by that parameter. The field is implicitly final in a struct and final if the parameter is marked final in a class.
It can contain super. parameters which implicitly get forwarded to the superclass constructor the way they do in a normal constructor declaration.
The primary constructor is implicitly const.
The class may define other constructors (generative, redirecting, or factory) as long as those constructors meet all of the normal obligations of initializing final fields, etc.
The class may also define other fields as long as it doesn't cause problems that the primary constructor doesn't initialize them: they are either initialized at their declaration, late, or nullable.
A class can omit its {} body and use ; instead if empty.
It's probably reasonable to do what @lrhn suggests and allow a constructor name before the parameter list too:
```
class Foo.name(int x);
```

There's the weird wrinkle around private named fields as named parameters in the primary constructor. I think I'd be OK with saying that you just can't do that.

Aug 04 '22 00:08 munificent

What @munificent says.

Parameter list occurs after class name — and after type parameters if any.

I'm actually, uncharacteristically, fine with allowing the field names in the parameter list to be private, and automatically make them public in the implicitly added constructor. It's reasonable to want private fields, and unreasonable to have private parameter names. Something needs to be tweaked. (I'd even be willing to contemplate making the name of the parameter of the common this._foo be foo, but that's potentially breaking if it's referenced as _foo later in the parameter list.)

I'm now OK with making the constructor const if possible (superclass constructor is const, any further fields in a class declaration are non-late and final - and therefore necessarily initialized with a constant.)

Biggest issue: Do we need a way to specify a super-constructor other than the unnamed one?

If we allow Foo.name(int x) for a primary constructor, we'd also want to be able to call that from a subclass primary constructor. Maybe we can heuristically say that the primary constructor calls the superclass constructor:

which is a primary constructor, if the superclass has one
otherwise the one with the same name (empty/new name if unnamed), if it exists,
otherwise use the unnamed constructor, if it exists.

Most people will just use the unnamed primary constructor for everything, and that'll just work.

On second thought, there is one problem with implicitly inferring const for the constructor. (Other than getting people locked into it without them knowing it.) If the primary constructor is const when:

The superclass constructor is const, and
Default values of primary constructor parameters are const, and
Any fields added to the class supports being const (not late, final, and is nullable or has an initializer which is constant, even though its context isn't constant).

then whether the constructor actually is const will depend on very fragile and accidental choices.

Adding a field like final int x = 42; will preserve const-ness. Changing it to static final int _defaultX = 42; final int x = _defaultX; will make the constructor non-constant. (If _defaultX had been constant, it would work.) There is no warning, unless you check that the constructor can be used as const, which you might not care about (since you just broke it without noticing).

I think that's generally going to be too fragile. I'd recommend you having to write const to get a const constructor, say:

const class Foo(int x, int y);

That's an explicit opt-in to the primary constructor being constant. It makes it easy to give errors if some other part of the class doesn't support being const, rather than just silently not being constant. If you don't think about const-ness, you won't accidentally promise that the class can be used for constants.

Yes, it's one more word, and it'll likely be used a lot, but as long as we don't have const-by-default everywhere else, and an opt-out word for non-const-ness, I think we have to stick to the rule that const is not implicit, because it's a big promise you make in your API.

Aug 08 '22 10:08 lrhn

I think that's generally going to be too fragile. I'd recommend you having to write const to get a const constructor, say:
const class Foo(int x, int y);
That's an explicit opt-in to the primary constructor being constant. It makes it easy to give errors if some other part of the class doesn't support being const, rather than just silently not being constant. If you don't think about const-ness, you won't accidentally promise that the class can be used for constants.

Yeah, I think I agree that it's too fragile, and I'd be fine with this choice (to put const before the class). For structs/data classes (if we do them) perhaps it would be reasonable to say that data class means both immutable and const though?

Aug 09 '22 20:08 leafpetersen

Data classes/structs can be constant if their superclass is constant (and the superclass must be an abstract struct or the Object (or Struct, if we have it) class, so that should hold inductively), and if their primary constructor initializer/default-value expressions are constant (or at least potentially constant).

The current proposal allows non-potentially-constant initializer expressions. That means that

struct Foo({List<int> indices = [0]});

cannot have a constant constructor.

Again it becomes fragile to infer const for the constructor, because a slip of the hand, like writing = [] instead of = const[], will turn the struct from constant to non-constant without any real warning.

If initializer expressions have to be constant for structs, then we make all structs const, but I think that'll be too restrictive. There will be uses for structs that have mutable default values.

I mentioned earlier that we could have separate syntaxes for constant default values and non-constant initializers, say:

struct Foo({int x = 0, List<int> l ??= <int>[0]});

That would allow a struct with only = default values to be implicitly const, and using ??= being the way to opt out. You still don't have a way to opt out of providing a const constructor other than introducing a ??= initializer.

I'd still prefer to go with const struct Foo(int x, int y) to make the primary constructor const, rather than making it implicit.

Aug 10 '22 11:08 lrhn

Cf. https://github.com/dart-lang/language/pull/3023, a concrete proposal about this feature.

May 12 '23 07:05 eernstg

I'm very skeptical about the value of this feature. It adds a redundant syntax and no expressiveness. My intuition suggests that is not going to help with readability. I could be wrong though.

I think if we want to go down this path we should start with usability testing to ensure that it is a clear win for readability and maintainability, even in the kind of cases where the syntax is mixed with other syntaxes. Can people accurately determine that some syntaxes are identical in meaning? Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

May 16 '23 05:05 Hixie

@Hixie wrote:

.. skeptical about the value of this feature. It adds a redundant syntax and no expressiveness.

That's true, it is simply an abbreviated way to write something which is already expressible. Note that super-parameters have exactly the same nature, and so does => function syntax, and this.x as a constructor parameter, and more.

In any case, input from usability testing would be very useful!

Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

Great questions! They should be part of the usability testing.

I'll comment on them here as well. There is not much of a connection to getter and method declarations, they are not affected by the primary constructor proposal. But the interaction with all kinds of superinterfaces, and with any other Dart feature, follow from the meaning of the primary constructor.

One way to understand what a primary constructor means is by desugaring: Replace the primary constructor syntax (something like const D.name(...)) by the class name (just D), and use it to create a new constructor in the body (const D.name(...);). For each parameter which isn't this. or super., declare an instance variable with the same name x and the given type, and change the parameter to this.x. Modifiers final and covariant are used when declaring that instance variable and removed from the constructor parameter.

Everything else follows. For instance, if the desugaring introduced an instance variable, say, final int x;, then this declaration can override a getter int get x => 5; in the superclass, or implement int get x from a class that it implements. Nothing new, nothing surprising, we just need to know that the given primary constructor introduces that instance variable.

// This:
class D._(int x, {final int y = 0}) extends A with M implements B, C;

// .. means this:
class D extends A with M implements B, C {
  int x;
  final int y;
  D._(this.x, {this.y = 0});
}

I tend to think that it's a significant abbreviation, and not that hard to read. But that assessment would of course be tested in a crucial manner by some usability studies.

May 16 '23 08:05 eernstg

I'm very skeptical about the value of this feature. It adds a redundant syntax and no expressiveness.

@Hixie I can understand that, but the stance seems inconsistent with the immense demand for https://github.com/dart-lang/language/issues/314

May 16 '23 08:05 mit-mit

I think we should address problems like #314 using a more general mechanism like macros.

That's true, it is simply an abbreviated way to write something which is already expressible. Note that super-parameters have exactly the same nature, and so does => function syntax, and this.x as a constructor parameter, and more.

I think the main difference between those and this proposal is that the delta between the two forms is much less disruptive and more intuitive.

For example, you can change one argument from this.x to Foo x without affecting the others. You can change one super.foo named parameter to a a non-super one without affecting the others. When you replace the body of a method, there's no other feature you might accidentally run into that changes the result. The code remains roughly in the same place in all these cases.

On the other hand, with this primary constructor syntax any number of things make a radical difference to the code:

adding a body to the constructor
adding an assert to the constructor
changing how one of the constructor arguments is handled (e.g. taking a string and parsing it into an int to set a field)
changing how the constructor arguments map to superclass constructor arguments
having the type of the argument not match the type of the field (e.g. argument is int, field is num)

While I believe we should use a feature like macros to address this use case, in the event that we don't, I think it might be possible to find a design that doesn't suffer from the problems I describe above. For example, we could add a syntax to existing constructors that declares the field for you:

class Foo {
  const Foo(int field.x, int field.y);
}

This would have the properties of this.foo and super.foo, like allowing mixing and matching:

class Foo {
  Foo(int field.x, int field.y, double r) : assert(r > 0), d = r * 2 {
    // ...
  }

  final double d;
}

May 17 '23 06:05 Hixie

@Hixie wrote:

with this primary constructor syntax any number of things make a radical difference to the code:

adding a body to the constructor

So the issue here is that seemingly small changes to a given constructor would be syntactically small changes as long as that constructor is a body constructor, but they are suddenly much more disruptive when the constructor is a primary constructor.

I like the idea that @leafpetersen came up with, which is mentioned in the discussion part of the spec proposal.

In short, you can use all the parameter list features of a primary constructor in a body constructor by adding var as the very first token. It is an error to have two or more var constructors, and it is an error to have a var constructor if there is a primary constructor.

// Using a primary constructor.
class Foo(int x, int y);

// If we want to change the constructor beyond what we can do with a primary
// constructor, prepare by making it a `var` constructor:
class Foo {
  var Foo(int x, int y);
}

// Then we can add new elements like any other constructor.
class Foo {
  final int z;
  var Foo(int x, int y) : z = x + y, assert(x != y), super() {
    ... // We can have a body.
  }
}

So the idea is that the var constructor makes it less disruptive to perform all those changes. You could say that it is confusing that some instance variables are declared by that var constructor, but it would probably help a lot if we recommend a style where a var constructor (if any) occurs textually together with the instance variable declarations.

Another approach could be to have a quick fix in IDEs to change a class from using a primary constructor to a class which is equivalent, but uses a regular constructor and declares all variables in the normal way.

adding an assert to the constructor

This would be covered by var constructors and by the quick fix.

changing how one of the constructor arguments is handled (e.g. taking a string and parsing it into an int to set a field)

Sounds like this would be expressed using an element in the initializer list, which would be covered in the same way.

changing how the constructor arguments map to superclass constructor arguments having the type of the argument not match the type of the field (e.g. argument is int, field is num)

OK, so the primary constructor argument has type int, and it is a super-parameter, and it is passed to an initializing formal of type num. That's covered by primary constructors already:

// Today:

class A {
  num x;
  A(this.x);
}

class B extends A {
  B(int super.x);
}

// With primary constructors:

class A(num x);
class B(int super.x) extends A;

While I believe we should use a feature like macros to address this use case

That might be possible, too. It would probably be somewhat more verbose.

@jakemac53, what do you think about doing primary constructors as a macro? How would it look?

May 17 '23 09:05 eernstg

@jakemac53, what do you think about doing primary constructors as a macro? How would it look?

There is an example data class macro here, the usage looks like this. Basically I had you just define the fields normally and then generated a constructor to initialize them (it would suffer the same issues as described for primary constructors).

However you could also write a macro that worked based on a manually written constructor, where the parameters describe the fields that should be added, and we do allow you to fill in initializers and/or a constructor body, or even augment existing initializers and/or constructor bodies.

May 17 '23 14:05 jakemac53

Thanks, @jakemac53!

May 17 '23 15:05 eernstg

Could it be possible to allow the IDE to convert from a primary constructor to a normal constructor? Such as CTRL + . >> Move constructor to class body.

Could it be possible to allow users to replace macro annotations with their desugared form? Like the "delombok" tool (https://projectlombok.org/features/Data & https://projectlombok.org/features/delombok).

Which macro annotations will exist for sure in the Dart SDK? @allArgsConstructor or @data? @json? @toString, @hashCode, @toStringAndHashCode? @with?, @namedConstructor or @positional?, @record, @immutable.

May 18 '23 23:05 Wdestroier

I think if we want to go down this path we should start with usability testing to ensure that it is a clear win for readability and maintainability, even in the kind of cases where the syntax is mixed with other syntaxes. Can people accurately determine that some syntaxes are identical in meaning? Can they accurately determine how to extend a particular case to add a new getter or method? Do they understand how this syntax works with superclasses, interfaces, or mixins?

@inmatrix Thoughts on this?

May 19 '23 01:05 leafpetersen

It's a long thread and I'm afraid I'm not able to get up to speed right now. If you'd like to discuss the potential UX questions and which of them usability testing can and cannot answer, maybe we can have a meeting? It would be helpful if each side of the argument can includes a few examples.

May 19 '23 20:05 InMatrix

A meeting about the usability related questions would be great! I've attached a bunch of examples of concrete syntax using this feature specification proposal, see examples-primary-constructor.txt.

The examples can be used to browse around and see how various combinations of features would look. The examples may serve as a raw material for arguments in any direction (e.g., that it is delightfully concise, or that it is impossible to read ;-).

To find examples involving specific features: Look at the very first examples in order to see some simple cases; search for TypeVariable in order to see examples where the class has type variables, search for implements in order to see examples where the class implements some other classes; search for [ to see examples where an optional positional parameter is used; search for Point._ in order to see examples where the primary constructor is named; etc.etc.

May 22 '23 09:05 eernstg

I did some scraping to see how well @eernstg's primary constructor proposal (#3023) would work in practice. I looked the 2,000 most recent pub packages (~12MLOC).

Of the 81,968 that declare at least one generative constructor:

-- Could have primary (81968 total) --
  63581 ( 77.568%): Yes  ===========================================
  18387 ( 22.432%): No   =============

So more than 3/4 of classes with generative constructors could benefit from this feature.

Looking at all classes:

-- Class (117300 total) --
  61317 ( 52.274%): Only generative constructor can be primary      =======
  35332 ( 30.121%): No generative constructors                      ====
  18387 ( 15.675%): None of generative constructors can be primary  ==
   2264 (  1.930%): One of generative constructors can be primary   =

So a little over half of classes declare a single generative constructor which could be a primary constructor if we had this feature. A very small number of classes have multiple constructors, one of which could be primary. So it's only marginally useful to support having both a primary constructor and other body constructors (but I still think we should support that).

The fact that we have super. parameters and would allow them in primary constructors significantly increases the usefulness of the feature:

-- Has super init (63883 total) --
  44592 ( 69.803%): No   =======================================
  19291 ( 30.197%): Yes  =================

Of the constructors that could be primary constructors in this analysis, about a third of them wouldn't be if we didn't have super..

The reasons why a particular constructor can't be primary:

-- Reason (23298 total) --
   3717 ( 15.954%): initializer, parameter
   3296 ( 14.147%): superclass argument
   3283 ( 14.091%): body
   2896 ( 12.430%): body, parameter
   1947 (  8.357%): named superclass constructor
   1412 (  6.061%): initializer
   1314 (  5.640%): parameter, superclass argument
    718 (  3.082%): assert initializer
    658 (  2.824%): body, named superclass constructor
    640 (  2.747%): body, initializer, parameter
    636 (  2.730%): parameter, redirecting constructor
    296 (  1.270%): initializer, parameter, superclass argument
    286 (  1.228%): parameter
    278 (  1.193%): body, initializer, named superclass constructor, parameter
    264 (  1.133%): named superclass constructor, superclass argument
more...

Basically all over the place. But most often it seems to be because there's a constructor parameter that doesn't directly map to a field or super initializer, or because there's a constructor initializer that isn't just assigned from a field.

Overall, this proposal look like a slam dunk to me. Note that the numbers here without any special handling for initializing private fields with named parameters (#2509). This is just following the proposal in its current state.

Jun 28 '23 22:06 munificent

Just because a feature could be applied to some code, or even a lot of code, doesn't mean it's a valuable feature. I don't think anyone is arguing this feature couldn't apply broadly. To summarize my concerns as described in earlier comments:

it is a syntax sugar that introduces a cliff where any slight deviation from the supported set requires a disproportionate amount of effort.
we have not demonstrated that this syntax would improve readability. It reduces the simplicity of the language by having multiple ways to do something.

I think my concern stems primarily from the way that we're stuffing the constructor into an unrelated part of the class declaration. The only win seems to be avoiding the {} braces and the duplication of the class name.

(FWIW, Kotlin has a very similar syntax which, in my somewhat limited experience playing with Kotlin, has exactly the same problems I'm worried about here: as soon as you try to do anything outside their limited supported set of features, you have to switch to completely different syntax to get it done. I find this concerning because it means developers have to learn two sets of syntaxes, plus understand the limitations of each, plus understand the magical implications of each (e.g. const implying the fields are final)...)

Instead of:

class Point(int x, int y);

...why not:

class Point {
  Point(int field.x, int field.y);
}

...where field, in a manner similar to super and this, means "declare a field for this parameter"?

class Point<TypeVariable extends Bound> extends A with M implements B, C {
  const Point(final int field.x, {required final int field.y});
}

class Point {
  Point(final int field.x, final int field.y) : assert(math.max(x, y) > math.min(x, y) * 2);
}

This syntax solves all the same problems except one duplication of the constructor name (which seems like a rather minimal issue -- that said, if we had virtual constructors then we'd want to have a way to declare constructors without using the class name anyway, probably), and doesn't have any of the problems of the proposed syntax, because it's strictly a superset of the current features.

Jun 28 '23 23:06 Hixie

@Hixie the problem with that syntax is that is very hard to read.

I've had a lot of trouble with TS classes when I needed to a find a property declaration and it was in the constructor.
A constructor is basically a function with arguments. My eyes just ditch constructors for properties declaration all of the times.

The proposal way is much better because you can easily differentiate full properties from sugar properties.

Jun 28 '23 23:06 jodinathan

I don't understand. The proposal here makes no distinction at all. What are the properties in this example?

class const Point<TypeVariable extends Bound>(this.x, {required int y})
    extends A with M implements B, C {
  final int x;
}

Jun 29 '23 03:06 Hixie

language language copied to clipboard

Primary constructor on classes

Introduction

Discussion

language
language copied to clipboard