go spec: clarify order of operand evaluations in a value assignment

(This issue is separated from https://github.com/golang/go/issues/27804 after I realized it is not a proposal but just a doc improvement. There are no language changes in the improvement.)

Currently, for assignments in Go, Go specification specifies

The assignment proceeds in two phases. First, the operands of index expressions and pointer indirections (including implicit pointer indirections in selectors) on the left and the expressions on the right are all evaluated in the usual order. Second, the assignments are carried out in left-to-right order.

The rule description is some ambiguous. It doesn't specify clearly how the operands in the mentioned expressions on the left should be exactly evaluated. This is the cause of some disputes.

For the following program, if it is compiled with gccgo (version 8), then it will panic. But if it is compiled with the standard Go compiler, then the program will exit without panicking.

package main

func main() {
	s := []int{1, 2, 3}
	s, s[2] = []int{1}, 9 // gccgo 8 think s[2] is index of range
}

Obviously, gccgo think the multi-value assignment is equivalent to

	s = []int{1}
	s[2] = 9

However, the standard Go thinks it is equivalent to

	tmp1 = &s
	tmp2 = &s[2]
	*tmp1, *tmp2 = []int{1}, 9

Most Go team members think the interpretation of the standard Go compiler is what Go specification wants to express. I also hold this opinion.

To avoid description ambiguities, it would be good to append the following description to the rule.

In the first phase, if the map value in a destination expression is not addressable, the map value will be saved in and replaced by a temporary map value. Just before the second phase is carried out, each destination expression on the left will be further evaluated as its elementary form. Different destination expressions have different elementary forms:

If a destination expression is a blank identifier, then its elementary form is still a blank identifier.

If a destination expression is a map index expression m[k] then its elementary form is (*maddr)[k], where maddr is the address of m.

For other cases, the destination expression must be addressable, then its elementary form is a dereference to its address.

I think these supplement descriptions can avoid the above mentioned disputes.

[update at Oct. 18, 2019]: an imperfection is found by @mdempsky. The below is a simpler alternative improved description:

.... Second, the assignments are carried out in left-to-right order. The assignments carried-out during the second phase don't affect the expression evaluation results got at the end of the first phase.

Sep 23 '18 08:09 zigo101

For what it's worth, tip gccgo behaves like gc. See #23188.

Sep 24 '18 14:09 ianlancetaylor

I think the proposal is simple, but I think it's backwards incompatible in a subtle way that I don't think we want to commit to.

In particular, today we guarantee x[i] = f() will evaluate f() before bounds checking i. Experimentally, it seems like Python, Java, and JavaScript do as well.

If we define that x[i] = f() means t0, t1 := &x[i], f(); *t0 = t1 then it opens up the opportunity for compilers to bounds check x[i] before f().

I'll also emphasize this does not reflect how cmd/compile implements parallel assignments.

Oct 16 '19 20:10 mdempsky

@mdempsky Is you last comment intended for this issue instead: https://github.com/golang/go/issues/27804 ?

There is no disputes in this issue now. This issue is to suggest that more descriptions are needed in spec.

Oct 17 '19 23:10 zigo101

@go101 No, I commented on this issue as intended.

Today, this code:

var x [100]int
func f() int { ... }

x[i] = f()

means:

ti, tf := i, f()
x[ti] = tf

It does not mean:

tp, tf := &x[i], f()
*tp = tf

In particular, today a Go compiler is not allowed to panic because i >= 100 until after evaluating f(). But your proposed rules suggest otherwise.

Oct 17 '19 23:10 mdempsky

The current issue only tries to clarify the rule before the second phase (carry-out phase) in executing a multi-value assignments. More specifically, it tries to clarify that the addresses of the target values must be confirmed before executing the second phase. It is totally single-value assignments unrelated.

For your example, this issue doesn't try to clarify the evaluation order of x, i and f().

[Update]: What this issue tries to clarify is the expressions x, i, &x[i] and f() must all be evaluated before executing the second phase. &x[i] is surely evaluated after x and i, but for other orders, this issue doesn't specify.

Oct 18 '19 02:10 zigo101

@mdempsky I modified the last comment a bit.

The proposed description in the first comment is probably too long. The following shorter one might be better:

The assignments performed during the carry-out phase have not any effects on any destination expression evaluation.

Oct 18 '19 03:10 zigo101

More specifically, it tries to clarify that the addresses of the target values must be confirmed before executing the second phase.

But that's not the case today, so it's not a clarification.

For example:

package main

func main() {
	x := "before"
	func() {
		defer func() { recover() }()
		x, *(*int)(nil) = "after", 0
	}()
	println("panic happened", x, "assigning to x")
}

This program prints panic happened after assigning to x with both cmd/compile and gccgo.

Oct 18 '19 05:10 mdempsky

It executes as expected:

t0, t1, t2, t3 := &x, (*int)(nil), "after", 0 // the evaluation order is not specified

// Now, phase 1 done, the addresses of all target values are confirmed.
// (*int)(nil) is a valid address value.
// To start phase 2.

// In phase 2, assignments are executed by the following order:
*t0 = t2 // x == "after" now
*t1 = t3 // panic

This issue tries to clarify that the execution of *t0 = t2 should not affect the value of t1.

Oct 18 '19 08:10 zigo101

Why t1 := (*int)(nil)? You can't just simplify &*(*int)(nil) to (*int)(nil). The former nil panics, whereas the latter does not.

Alternatively, consider this:

package main

func main() {
	x := "before"
	func() {
		defer func() { recover() }()
		s := []int{1}
		x, s[2] = "after", 0
	}()
	println("panic happened", x, "assigning to x")
}

Based on your original comment, it sounds like you think the x, s[2] = "after", 0 assignment evaluates as:

tmp1 := &x
tmp2 := &s[2]
*tmp1, *tmp2 = "after", 0

But it doesn't.

Oct 18 '19 08:10 mdempsky

Eh, yes, the original description is not perfect. I have updated it to a simpler version.

Oct 18 '19 09:10 zigo101

Do I understand correctly that your proposal is now just:

The assignments performed during the carry-out phase don't affect the destination expression evaluation results confirmed in the first phase.

?

If so, I find this more confusing than the spec wording it's meant to clarify. E.g., it uses phrases that don't appear anywhere in the Go spec like "carry-out phase" and "confirmed".

I also don't think there's any dispute over the idea that assignments that happen during phase 2 don't affect (sub)expressions that were evaluated during phase 1.

Oct 18 '19 10:10 mdempsky

There is a line Second, the assignments are carried out in left-to-right order in spec. So I called the second phase as the carry-out phase. Would calling it "the second phase" be better?

There ever were the disputes, at least at the time when the issue was created.

Oct 18 '19 11:10 zigo101

I modified it as

.... Second, the assignments are carried out in left-to-right order. The assignments carried-out during the second phase don't affect the expression evaluation results got in the first phase.

Oct 18 '19 12:10 zigo101