
Generalize compound assignments to include `.=`

Open eernstg opened this issue 6 months ago • 5 comments

Here's an idea which could be explored in order to enable some transformations which are somewhat verbose and inconvenient in Dart today.

Consider the following program:

(int, int) _process(int i, int j) => (i + 1, j - 1);

void main() {
  var x = 1, y = 2;
  (x, y) = _process(x, y);
}

This kind of transformation may arise because we're working on a number of variables, and we've made the choice to extract a piece of work into a separate function (here: _process) in order to make the main function (here: main) smaller and more readable.

It could be argued that this should be done by introducing a class where those variables are instance variables and the computation is an instance method:

class A {
  int x, y;
  A(this.x, this.y);
  void _process() {
    x = x + 1;
    y = y - 1;
  }
}

void main() {
  var a = A(1, 2);
  a._process();
}

It could be argued that this is a cleaner solution because (presumably) the class A and the instance variables x and y could now be renamed in such a way that that class and its state and methods together amount to a meaningful abstraction, and the whole thing is much more natural and easy to understand.

This generalizes nicely in the sense that we can now add more methods to the class in order to handle other tasks where those variables are processed, including the case where _process is getting too large and we split it into multiple methods of the same class.

However, I believe that there will be many situations where such abstractions are ephemeral, and the design work that goes into building this abstraction isn't justified. For example, we might want to process a different but overlapping set of variables a few lines after having done a._process(). We surely don't want to extract x and y from a and put them into another short-lived object in order to process that different set of variables: This will immediately create complexity concerned with the fact that "the same thing" is now stored in a.x and in thatNewHelperObject.someName. We could also make the class bigger and handle all the steps of processing inside the same class—but that creates a similar level of complexity in the class that we wanted to eliminate from the main function in the first place.

So let's consider some approaches that are different from "Just use standard OO abstractions".

One well-known mechanism which could allow us to abbreviate the original form is call-by-reference parameters:

// Assuming that Dart adds support for call-by-reference parameters, and
// assuming that they are both declared and passed using `&`.

void _process(int& i, int& j) {
  i = i + 1;
  j = j - 1;
}

void main() {
  var x = 1, y = 2;
  _process(&x, &y);
}

We could do this, and it would solve the original issue very directly. So let's keep that in mind.

However, it is certainly possible that Dart will not add support for call-by-reference parameters, not just because of the need to prioritize the work on new features, but also because call-by-reference parameters could have significant costs in terms of performance or other language characteristics. For example, it might make garbage collection more complex if it implies that Dart programs can have references (that is: pointers) to instance variables that are located in the middle of a heap object or in the middle of an activation record on the stack.

We could, however, do something entirely different: we could generalize compound assignments to handle these cases.

The basic idea in a compound assignment is that we're abstracting over an expression that has a repeated element by introducing new syntax where this element isn't repeated. For example, x = x + 1 is expressed more concisely as x += 1.

A form which is structurally similar to x = x + 1 is (x, y) = (x, y)._process, which would then be expressible as (x, y) .= _process. Here is the original example, adjusted to use this mechanism:

extension on (int, int) {
  (int, int) get _process => ($1 + 1, $2 - 1);
}

void main() {
  var x = 1, y = 2;
  (x, y) = (x, y)._process;
  // With this proposal, we could write the same thing like this:
  (x, y) .= _process;
}

This kind of compound assignment would support at least the following kinds of terms on the left hand side:

  1. An <assignableExpression>.
  2. An <outerPattern> which is also an expression.

The latter includes the case we've already seen, (x, y) .= _process. A couple of other examples:

void main() {
  var myVariableWhoseNameIsLong = "Hello, world!";
  myVariableWhoseNameIsLong .= substring(0, 5);

  var xs = [1, 2, 3];
  xs .= map((x) => x + 10).toList();

  var x = 100, y = 200, z = 300, w = 400;
  [x, y, z, w] .= reversed.toList();
}

In general, the mechanism would be that e .= t means e = e.t, with the usual caching that ensures that subterms of e aren't evaluated multiple times:

class A {
  int x;
  A(this.x) {
    print("Created a new A");
  }
}

void main() {
  A(-10).x .= abs();
}

This program would print Created a new A once, not twice, because it has the following semantics:

void main() {
  final _tmp = A(-10);
  _tmp.x = _tmp.x.abs();
}
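
The same single-evaluation rule would presumably extend to index expressions. Here is a minimal sketch in today's Dart, assuming that `e1[e2] .= t` caches both the receiver and the index; the names `log`, `values`, `getValues`, `nextIndex`, and `desugaredIndexUpdate` are hypothetical and only serve the illustration:

```dart
// Records side effects so we can see how often each subterm runs.
final log = <String>[];

List<int> values = [-1, -2, -3];

List<int> getValues() {
  log.add('getValues');
  return values;
}

int nextIndex() {
  log.add('nextIndex');
  return 1;
}

// Assumed desugaring of `getValues()[nextIndex()] .= abs();`,
// where the receiver and the index are each evaluated exactly once.
void desugaredIndexUpdate() {
  final tmpReceiver = getValues();
  final tmpIndex = nextIndex();
  tmpReceiver[tmpIndex] = tmpReceiver[tmpIndex].abs();
}

void main() {
  desugaredIndexUpdate();
  print(values); // [-1, 2, -3]
  print(log); // [getValues, nextIndex]
}
```

This mirrors how `e1[e2] += 1` already behaves: the side effects in `e1` and `e2` happen once, even though the element is both read and written.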

Finally, in order to support the original example directly, we could allow the right hand side of a .= compound assignment to be a function that accepts the left hand side as an actual argument or argument list. In short, we would allow e .= t to mean e = t(e), and (e1, e2, named: e3) .= t would mean (e1, e2, named: e3) = t(e1, e2, named: e3). For example:

(int, int) _process(int i, int j) => (i + 1, j - 1);

void main() {
  var x = 1, y = 2;
  (x, y) .= _process;
}

This treatment would kick in when the right hand side of .= is a sequence of selectors where the first element is an identifier, and that identifier is not the name of any member of the static type of the implicit receiver (which is the left hand side of .= considered as an expression).
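
As a rough sketch of the named-argument case, the following shows what `(x, y, named: z) .= step` would expand to, written as an ordinary pattern assignment in today's Dart. The function `step` and the wrapper `run` are hypothetical names chosen for this illustration:

```dart
// Hypothetical function taking two positional arguments and one named argument.
(int, int, {int named}) step(int a, int b, {required int named}) =>
    (a + 1, b - 1, named: named * 2);

// The assumed meaning of `(x, y, named: z) .= step;`:
// the left hand side is used both as the argument list and as the pattern.
(int, int, int) run(int x, int y, int z) {
  (x, y, named: z) = step(x, y, named: z);
  return (x, y, z);
}

void main() {
  final (a, b, c) = run(1, 2, 3);
  print('$a $b $c'); // 2 1 6
}
```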

eernstg avatar Jun 06 '25 09:06 eernstg

My immediate response is that I don't like the syntax. The (x, y) .= process moves the . so far away from the process that it's no longer easy to see that .process is a member invocation.

The syntax makes sense for assignment patterns and single variables, so cursor .= next; would be the new way to operate a linked list. I could probably get used to that.

For more complicated patterns, it's treating a pattern like an expression. That's not necessarily possible. Or maybe the restrictions on assignment patterns are enough to guarantee that we can create a value based on the initial values of the assigned variables, but it feels counterintuitive that a destructuring pattern also constructs first.

Here (x, y) .= process would create a new pair. But you can also do MapEntry(key: x, value: y) = ... as an assignment pattern, and there is no guarantee that you can construct a MapEntry from that. It could have been an abstract interface.
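
A small sketch of that asymmetry in today's Dart, using a hypothetical helper named `destructure`: the object pattern only reads getters from the matched value, whereas the same text read as an expression would need a constructor with matching named parameters, which `MapEntry` does not have.

```dart
// Hypothetical helper: overwrite two local variables from a MapEntry.
(String, int) destructure(MapEntry<String, int> entry) {
  var key = '';
  var value = 0;

  // Valid today: an object pattern used as an assignment pattern.
  // It only calls the `key` and `value` getters on `entry`.
  MapEntry(key: key, value: value) = entry;

  // Not valid as an expression: the MapEntry constructor takes its
  // arguments positionally, so the line below would not compile.
  // final copy = MapEntry(key: key, value: value);

  return (key, value);
}

void main() {
  final (k, v) = destructure(MapEntry('b', 2));
  print('$k $v'); // b 2
}
```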

lrhn avatar Jun 06 '25 10:06 lrhn

> The (x, y) .= process moves the . so far away from the process that it's no longer easy to see that .process is a member invocation.

It's literally two characters away! ;-)

In particular, we can read (x, y) = ... off of (x, y) .= ... by deleting the period, and we can read (x, y).process off of (x, y) .= process by deleting the = (and the whitespace, if you wish), yielding (x, y) = (x, y).process. This is extremely similar to the corresponding reading of x += 2.

Another example is s .= substring(0, 5), which can be read as an expression which is related to s = ... and related to s.substring(0, 5), yielding s = s.substring(0, 5).

One thing that this mechanism does require is that readers familiarize themselves with this construct: It is not known from other languages (as far as I know), and it's quite powerful. It will take some time to find the exact set of cases where the mechanism should be supported.

The potential variant that I mentioned at the very end includes a number of obvious cases (like the one I mentioned), but this must also be fleshed out in detail. In particular, I'd expect the following to work as indicated in the comments:

(int, int) foo(int i, int j) => (i + 1, j - 1);

int bar(int i) => i * 2;

void isEven() {}

void main() {
  var x = 1, y = 2;
  (x, y) .= foo; // Meaning `(x, y) = foo(x, y)`
  x .= bar; // Meaning `x = bar(x)`.
  x .= isEven(); // Error, a `bool` object can not be invoked.
}

In the last statement, isEven denotes the instance member isEven on int because the instance members have priority and the potential mechanism where the right hand side is a function only kicks in when the interface of the receiver (here: x with interface int) doesn't have a member with the given name.

eernstg avatar Jun 06 '25 14:06 eernstg

This seems totally unnecessary. We already have reasonable ways to process items.

var x = 0, y = 1;
(x, y) = process(x, y);
// Or
var pair = (0, 1);
pair = pair.process();

Afaik that's all we need. For anything else, we have classes.

The syntax is kinda terrible too.

TekExplorer avatar Jun 06 '25 15:06 TekExplorer

> It's literally two characters away! ;-)

Including a space, and that's really the most important part. The . is not connected to the name, which it otherwise is for member accesses. A space has a very big effect on reading.

lrhn avatar Jun 06 '25 15:06 lrhn

I also don't think it is that useful, and the syntax is not great.

If this becomes a thing I would like to have a linter suggesting to avoid this syntax.

Well, I don't even like ++ or +=, I usually write a = a + b instead, but this is obviously a personal and subjective opinion.

mateusfccp avatar Jun 06 '25 15:06 mateusfccp

I agree with the proposal and believe it is most definitely useful for those of us who look at the code more visually, structurally, and in large chunks at once. Those who do not want to use the feature do not have to, and standard lints can discourage its use. I would be willing to use the syntax even if I had to explicitly enable it or configure lints just to use it.

RohitSaily avatar Jun 24 '25 16:06 RohitSaily