go icon indicating copy to clipboard operation
go copied to clipboard

proposal: spec: define return statement's result assignment order?

Open bradfitz opened this issue 2 years ago • 6 comments

A coworker sent out a code review with:

func As(err error) (_ Error, ok bool) {
	var e Error
	return e, errors.As(err, &e)
}

... and during review I said I wasn't sure whether it works. Turns out it worked but it's undefined. It happens to work in cmd/compile and doesn't work in gccgo.

The spec defines assignment order for e.g. a, b = foo(), bar() or a, b = b, a but not for return.

Another example which has different results between gc and gccgo: https://go.dev/play/p/SDMlczFBshC

I propose the spec defines this. My preference would be left to right assignment to results.

But admittedly it might break code like above. (which we fortunately rewrote to be simpler and explicit)

/cc @ianlancetaylor @griesemer (sorry)

bradfitz avatar Feb 01 '23 23:02 bradfitz

CC @mdempsky

ianlancetaylor avatar Feb 01 '23 23:02 ianlancetaylor

I'd think that returns work like assignments (of the result values to the result variables) and thus should behave the same way with respect to evaluation order. (Haven't investigated the spec yet.)

griesemer avatar Feb 01 '23 23:02 griesemer

Note that even x, y = e, f(&e) is undefined as to whether e is evaluated before or after the f(&e) call.

FWIW, cmd/compile already uses the same logic to handle assignment and return statements.

mdempsky avatar Feb 02 '23 00:02 mdempsky

Related: ~~https://github.com/golang/go/issues/27821~~ (Edit: sorry, it should be https://github.com/golang/go/issues/27804)

zigo101 avatar Feb 02 '23 15:02 zigo101

Perhaps this should be documented as an implementation restriction? "A compiler may disallow return statements for which the order of evaluation cannot be determined due to side effects." Or something similar?

beoran avatar Feb 03 '23 09:02 beoran

It can determine for sure, it just doesn't now.

zigo101 avatar Feb 03 '23 13:02 zigo101

I agree that if you have

var e Error
return e, errors.As(err, &e)

then the spec does not guarantee whether the first result is evaluated before or after the call to errors.As.

But if you have:

var e Error
a, b := e, errors.As(err, &e)
return a, b

then I don't believe the spec defines whether a = e happens before or after the call to errors.As either.

In either case, I don't think that's something we can easily change at this point.

rsc avatar Mar 29 '23 16:03 rsc

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc avatar Mar 29 '23 19:03 rsc

We can use an eval function to assure the order.

func Eval[T any](v T) T {
	return v
}

func As(err error) (_ Error, ok bool) {
	var e Error
	return Eval(e), errors.As(err, &e)
}

[update]: similar to https://github.com/golang/go/issues/36449

zigo101 avatar Apr 01 '23 06:04 zigo101

@go101 In my opinion it would be better to not write such confusing code in the first place.

In my opinion we should either fully specify the evaluation order or we should have a vet check to warn about cases in which we both read and write a variable in the same statement. The vet check wouldn't be perfect but it might be able to catch cases like this.

ianlancetaylor avatar Apr 04 '23 01:04 ianlancetaylor

Based on the discussion above, this proposal seems like a likely decline. — rsc for the proposal review group

rsc avatar Apr 06 '23 15:04 rsc

I've used this pattern extensively ever since I learned of it. It did seem odd at first blush. Nowadays it feels similar to an inline defer, if that makes sense.

I'd venture to guess that defining it strictly as left-to-right would break a lot of code; my preference is to officially define the current semantics. I'd be -1 on a vet but mostly because I like the current pattern a lot for very boilerplate HTTP methods (e.g., return body, issuePost(&body)), but I'm not too tied to this either way.

It seems to be that defining the semantics would go a long way towards eliminating confusion. I remember being just as confused by defer having the ability to modify named returns after a return -- but after internalizing that, it's clear and obviously useful.

twmb avatar Apr 06 '23 17:04 twmb

Note that there is not a current semantics. The exact behavior depends on compiler details that can vary from release to release.

ianlancetaylor avatar Apr 06 '23 17:04 ianlancetaylor

Interesting. In that case, I do still recommend officially defining the current behavior as per cmd/compile. Looking through code I have checked out with the rather lazy rg --pcre2 'return (?:[[:alnum:]]+, )*([[:alnum:]]+), .*\([^)]*&\1\b', which isn't comprehensive, shows (besides my own libraries) a few libraries that are vendored in the kubernetes repo as well as the prometheus/client_golang repo:

Vendored in kubernetes, but also their own actual real libraries: https://github.com/Azure/go-autorest/blob/553a90ae65a6a2b18306fa04d7b1625960c5decb/autorest/utility.go#L64-L65 https://github.com/vmware/govmomi/blob/main/vapi/tags/categories.go#L93-L94

Vendored in kubernetes and not in the library anymore: https://github.com/kubernetes/kubernetes/blob/master/vendor/github.com/cilium/ebpf/internal/btf/types.go#L603-L604

Prometheus: https://github.com/prometheus/client_golang/blob/main/api/prometheus/v1/api.go#L894-L895

twmb avatar Apr 06 '23 18:04 twmb

I think that you are suggesting that we define

    return e, errors.As(err, &e)

as being implemented as

  1. Evaluate errors.As(err, &e).
  2. Evaluate e.

The rule would be something like "when multiple expressions appear on the right hand side of an assignment or return statement, first evaluate all the function calls, method calls, and channel operations in left-to-right order. Then evaluate all the other expressions in left-to-right order."

That strikes me as a strange choice to make. Wouldn't it be simpler say "evaluate all expressions in left-to-right order"? If we're going to give a precise definition for this kind of confusing code, shouldn't we aim for the simplest precise definition?

ianlancetaylor avatar Apr 06 '23 22:04 ianlancetaylor

Another way to frame it might be, "when multiple expressions appear on the right hand side of an assignment or return statement, evaluate all operations that can modify an expression in left to right order, and then evaluate all other expressions in left to right order", though it is perhaps less explicit. It seems a bit easier to explain than deferred functions, where the wording is currently "deferred functions are executed after any result parameters are set by that return statement but before the function returns to its caller" (and then a large example to be clearer).

twmb avatar Apr 06 '23 23:04 twmb

Maybe I'm too close to it, but to me those seem like very different things.

A deferred function is executed after the return statement is complete, and after the values have been assigned to the result parameters. That seems clear. There is no ambiguity or confusion about order of evaluation. 1) Evaluate all return expressions (in some order); 2) Assign them all to result parameters; 3) Execute deferred functions.

The order you suggest, which is similar to the one I mentioned, is still more complicated than "evaluate all expressions in left to right order." And my point is really that, if we are going to specify an order of evaluation, why wouldn't we just say "evaluate all expressions in left to right order"? Why would we want to say something that is more complicated than that?

ianlancetaylor avatar Apr 06 '23 23:04 ianlancetaylor

I agree that "evaluate all expressions in left to right order" is much simpler to reason about. Potential points in favor of being more complicated are,

  • People are already relying on this behavior -- albeit, without realizing their code is broken on gccgo and there isn't actually a guarantee their code will work in the future

  • Defining strict left-to-right may result in even more subtle bugs.

If I have

type foo struct {
    copyByVal  string
    copyByAddr map[string]bool
}

func modify(f *foo) bool {
    f.copyByVal = "modified"
    f.copyByAddr["modified"] = true
    return true
}

func problem() (foo, bool) {
    f := foo{copyByAddr: make(map[string]bool)}
    return f, modify(&f)
}

With the current behavior, foo is consistently deeply modified before being returned. With the proposed strict left-to-right, fields in a struct that are non-pointers are returned as is, while fields that are pointers are still modified, resulting in some weird half state modification.

However, defining a vet, or more strictly, refusing to compile this type of return would be a strong signal to developers that they can't do this anymore (and will force ecosystem library churn)

twmb avatar Apr 06 '23 23:04 twmb

Broadly speaking, I think this issue can be generalized to this sentence in the spec, "However, the order of those events compared to the evaluation and indexing of x and the evaluation of y is not specified." and the section following. The current cmd/compile implementation evaluates all functions first. It may be worthwhile to use this issue to specify all of these types of operations (not just return order evaluation).

twmb avatar Apr 07 '23 00:04 twmb

return f, modify(&f)

Such code seems to me to be pretty clearly buggy, in a similar way to code that relied on map iteration order (prior to the randomization update).

An update to the spec might be "The order of those events ... is not specified and is unpredictable. Code which relies on any specific ordering behavior is incorrect. A future version of Go may codify this unpredictability."

08d2 avatar Apr 07 '23 01:04 08d2

The current cmd/compile implementation evaluates all functions first.

This is not true. The current cmd/compile implementation has several self-inconsistencies. See https://github.com/golang/go/issues/36449

zigo101 avatar Apr 07 '23 14:04 zigo101

We can't adopt the rule "run function calls then evaluate expressions", because then if you have code like

return f(), g()

or

x, y := f(), g()

and you have a tool that inlines all the calls to f() (suppose func f() { return global }) you get

return global, g()

or

x, y := global, g()

Before global was evaluated before g(); now it's evaluated after g(). Of course the rewritten code is undefined today, but if we're going to define it, we should define it consistent with manual inlining, which would be a strict left-to-right order.

Return and assignments also have to match, so if we're going to make a rule, it has to be for both, which is why I showed both above.

All that said, I don't think we are considering redoing evaluation order in assignments today. If someone wants to implement strict left-to-right and measure the performance hit, that would be useful data to open a conversation about evaluation order more broadly.

rsc avatar Apr 12 '23 17:04 rsc

Am I reading this correctly,

  • Things right now are undefined
  • There's no willingness to define
  • If things were defined, left-to-right is the way to go from a consideration of function inlining
  • There's an ecosystem that relies on the current behavior of cmd/compile that is actually broken with gccgo

Unfortunately it seems like this would've been valuable to specify in the 1.0 spec. Since it's too late for that -- I'd side with @ianlancetaylor's original comment of adding a vet check, to gradually migrate the existing ecosystem away from the currently assumed behavior.

twmb avatar Apr 12 '23 17:04 twmb

In my opinion we should either fully specify the evaluation order or we should have a vet check to warn about cases in which we both read and write a variable in the same statement. The vet check wouldn't be perfect but it might be able to catch cases like this.

Note that errors.As(err, &e) is a may write. So I think a vet check would likely be for whether a variable is read in a different expression in a return's expression list than an expression in the return when a reference to the variable is taken and that reference may be written to. I am guessing this may be written would be highly over-approximate, and my concern for false-positives would be for whether to include all calls to methods with pointer receivers. FWIW what I am describing would warn on all of the examples in https://github.com/golang/go/issues/58233#issuecomment-1499471060. I think someone would need to try this before we know how reasonable this would be.

timothy-king avatar Apr 12 '23 18:04 timothy-king

I agree that we would have to try it out. I do think that we should get a warning on every piece of code in that comment.

ianlancetaylor avatar Apr 12 '23 19:04 ianlancetaylor

No change in consensus, so declined. — rsc for the proposal review group

rsc avatar Apr 12 '23 20:04 rsc