build: run.bash: make test configuration independent of the user's GOENV file
What version of Go are you using (go version)?
$ go version go version go1.18.3 linux/amd64
Does this issue reproduce with the latest release?
Yes, at least for faf4e97200df9415d55b04924ce77599f9e6fdfa
What operating system and processor architecture are you using (go env)?
go env Output
$ go env GO111MODULE="" GOARCH="amd64" GOBIN="" GOCACHE="/home/zyh/.cache/go-build" GOENV="/home/zyh/.config/go/env" GOEXE="" GOEXPERIMENT="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOINSECURE="" GOMODCACHE="/home/zyh/go/pkg/mod" GOOS="linux" GOPATH="/home/zyh/go" GOPROXY="https://goproxy.cn,direct" GOROOT="/usr/local/go" GOSUMDB="sum.golang.google.cn" GOTMPDIR="" GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64" GOVCS="" GOVERSION="go1.18.3" GCCGO="gccgo" GOAMD64="v3" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="/dev/null" GOWORK="" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2199054983=/tmp/go-build -gno-record-gcc-switches"
What did you do?
- Clone Go source repo
go env -w GOAMD64=v2ORgo env -w GOAMD64=v3(v1 is ok)cd src && ./all.bash
What did you expect to see?
ALL TESTS PASSED
What did you see instead?
go test proxy running at GOPROXY=http://127.0.0.1:33445/mod
--- FAIL: TestScript (0.01s)
--- FAIL: TestScript/test_relative_import_dash_i (0.21s)
script_test.go:270:
# (2022-07-27T06:51:16Z)
# Relative imports in go test -i (0.000s)
# Run tests outside GOPATH. (0.000s)
# Check that it's safe to pass -i (which installs dependencies in $GOPATH/pkg) to go test. (0.209s)
> ! stale runtime # don't let test -i overwrite runtime
FAIL: testdata/script/test_relative_import_dash_i.txt:8: runtime is unexpectedly stale: not installed but available in build cache
FAIL
FAIL cmd/go 26.846s
I can't reproduce.
TestScript tends to be one of our flaky tests. Is your failure reproducible?
Yes, it can be reproduced stably on two PCs(linux/amd64).
- Intel 11700k: Ubuntu 20.04.2 LTS (GNU/Linux 5.13.0-52-generic x86_64)
- AMD 3700x: Ubuntu 21.10 (GNU/Linux 5.13.0-44-generic x86_64)
On the plus side, note that the -i build flag is being deleted per https://github.com/golang/go/issues/41696, so my attempt at https://go-review.googlesource.com/c/go/+/416094 for Go 1.20 deletes that test script file entirely. So it might not be worthwhile to try to deflake that test given it will be deleted soon.
cc @bcmills
Sorry, I take that back. I can reproduce.
It doesn't reproduce with GOAMD64=v3 ./all.bash. It also doesn't reproduce with go env -w GOAMD64=v3; ./make.bash; ../bin/go test -short cmd/go.
Something very strange going on.
A hunch: perhaps make.bash does not use the value from go env -w GOAMD64=v3 to build the toolchain itself, but it is used for later builds, including go test.
Neat! https://go.dev/cl/356611 is closely related, although I don't think that change will directly fix this issue.
Perhaps TestScript should explicitly set GOAMD64 to the GOAMD64 of the host toolchain? 🤔
Hacking the compiler to ignore the GOAMD64 setting still reproduces. So it probably isn't a code-generation issue.
No, it's probably just a cache-invalidation issue. The build-ID for the archive installed to GOROOT/pkg/linux_amd64/runtime.a doesn't match the build-ID that would be installed when running the test, because the GOAMD64 value that goes into the hash doesn't match.
go env -w GOAMD64=v2ORgo env -w GOAMD64=v3(v1 is ok)
cmd/dist explicitly sets GOENV=off:
https://cs.opensource.google/go/go/+/master:src/cmd/dist/build.go;l=257;drc=78b8e4fbce8a49d7624362242e344968c4fa8382
Under that configuration all.bash should be installing a standard library and toolchain built with GOAMD64=v1 (the default).
The cmd/go test also explicitly sets GOENV=off:
https://cs.opensource.google/go/go/+/master:src/cmd/go/go_test.go;l=276;drc=c2edb2c841149f2e56963071065d52247e24092a
So as far as I can tell, under all.bash everything should be installed with GOAMD64=v1 and run with GOAMD64=v1, and I have no idea how the non-default GOAMD64 setting would be coming into play. 😵💫
If I run:
src$ go env -w GOAMD64=v2
src$ ./all.bash
…
src$ go version -m ../bin/go
I see:
build GOAMD64=v2
However, if I run make.bash instead of all.bash, I see:
build GOAMD64=v1
So I think this is a bug in run.bash.
Indeed, since CL 5792055 run.bash runs go env:
https://cs.opensource.google/go/go/+/master:src/run.bash;l=24;drc=69756b38f25bf72f1040dd7fd243febba89017e6
And it does so without setting GOENV, so that call picks up settings from the user's configuration file.
make.bash passes a --no-rebuild argument to run.bash, and run.bash forwards that to dist test:
https://cs.opensource.google/go/go/+/master:src/all.bash;l=13;drc=95e7897bd81885f1068faa0652cc463c00364f62
However, since CL 75850 dist test astonishingly disregards the --no-rebuild flag and rebuilds anyway when the GO_BUILDER_NAME environment variable is not set:
https://cs.opensource.google/go/go/+/master:src/cmd/dist/test.go;l=143-159;drc=462b78fe7027ef0d2e2b40c3cfd1f5a37d307310
So I think we end up in a state where make.bash installs one GOAMD64 variant of the toolchain and standard library, and run.bash silently overwrites it with another.
But I'm still missing a piece of the puzzle: given the above explanation, run.bash should be reinstalling the entire runtime and toolchain with GOAMD64=v2 (because it propagates GOAMD64 from the environment), and then running go test cmd/go with GOAMD64=v2 as well. So how is the runtime package ending up stale anyway‽
Here's the final piece, uncovered by adding two commands to the test to log some more information:
go env GOAMD64
go version -m $GOROOT${/}bin${/}go
That produces:
> go env GOAMD64
[stdout]
v1
> go version -m $GOROOT${/}bin${/}go
[stdout]
/usr/local/google/home/bcmills/go-review/bin/go: devel go1.19-462b78fe70 Wed Jul 27 20:21:13 2022 +0000
path cmd/go
build -compiler=gc
build CGO_ENABLED=1
build CGO_CFLAGS=
build CGO_CPPFLAGS=
build CGO_CXXFLAGS=
build CGO_LDFLAGS=
build GOARCH=amd64
build GOOS=linux
build GOAMD64=v2
TestScript intentionally does not propagate GOAMD64 from the environment and intentionally does not read the user's environment file, so this means that the go command (installed with GOAMD64=v2) is defaulting to GOAMD64=v1.
That default comes from internal/buildcfg/zbootstrap.go, which is written by make.bash:
https://cs.opensource.google/go/go/+/master:src/cmd/dist/buildruntime.go;l=62;drc=690ac4071fa3e07113bf371c9e74394ab54d6749
Crucially, that generated default is not overwritten by dist test's aggressive reinstallation. So dist test is overwriting the standard library and toolchain based on the environment, but not changing the defaults based on that environment — thus causing the installed toolchain to become stale.
I'm going to call this a run.bash bug, because run.bash is supposed to be testing the same configuration that make.bash installs. I'll file a separate issue for dist test --no-rebuild ignoring the flag and rebuilding things anyway.
Gah, more puzzles! The failure mode I just described above should have been fixed by CL 398061, but the test failure still reproduces after that change. Why‽ 😵💫
Apparently go tool dist is somehow setting the GOAMD64 environment variable, which go tool dist env is then picking up and reporting:
~/go-review/src$ go tool dist env | grep GOAMD64
GOAMD64="v2"
~/go-review/src$ GOROOT=$(go env GOROOT) $(go tool -n dist) env | grep GOAMD64
GOAMD64="v1"
I guess that comes from here: https://cs.opensource.google/go/go/+/master:src/cmd/go/main.go;l=202-213;drc=cc2e7f36bad66c24eecf7868e7ac35c74455b212
Change https://go.dev/cl/420055 mentions this issue: run: set GOENV=off when running 'go tool dist env'