cmd/go,cmd/link: TestScript/build_issue48319 and TestScript/build_plugin_reproducible failing on LUCI gotip-darwin-amd64-longtest builder due to non-reproducible LC_UUID
(It is unclear to me if this is an issue with the test, cmd/go, the compiler/linker, or the builder itself)
Example failure: https://ci.chromium.org/ui/inv/build-8759926960216361809/test-results?sortby=&groupby=
Both tests are failing because they aren't getting a reproducible build.
script_test.go:156: FAIL: testdata/script/build_issue48319.txt:29: cmp -q main.exe main1.exe: main.exe and main1.exe differ
script_test.go:156: FAIL: testdata/script/build_plugin_reproducible.txt:6: cmp -q a.so b.so: a.so and b.so differ
I haven't yet been able to reproduce on a gomote because the LUCI gomote setup doesn't currently set up Xcode properly, so cgo doesn't work (which these tests require).
cc @bcmills @dmitshur @mknyszek @cagedmantis
Workaround to get Xcode on a gomote:
Note: Depending on which machine you get, the mac_toolchain binary referenced below may be at either /Users/swarming/.swarming/w/ir/tools/bin/mac_toolchain or /Volumes/Work/s/w/ir/tools/bin/mac_toolchain.
$ gomote run mpratt-gotip-darwin-amd64-longtest-0 /bin/mkdir /tmp/xcode
$ gomote run mpratt-gotip-darwin-amd64-longtest-0 /Users/swarming/.swarming/w/ir/tools/bin/mac_toolchain install -xcode-version 15a240d -output-dir /tmp/xcode/Xcode.app
$ gomote run mpratt-gotip-darwin-amd64-longtest-0 /usr/bin/sudo xcode-select --switch /tmp/xcode/Xcode.app
Might have something to do with code-signing? (But then why aren't those tests failing on the darwin-amd64-longtest legacy TryBots too?)
With Xcode installed, this (thankfully) does reproduce (no pun intended):
$ gomote run mpratt-gotip-darwin-amd64-longtest-0 ./go/bin/go test -run=TestScript/build_plugin_reproducible -v cmd/go
# Streaming results from "mpratt-gotip-darwin-amd64-longtest-0" to "/tmp/gomote2019819704/mpratt-gotip-darwin-amd64-longtest-0.stdout"...
=== RUN TestScript
vcs-test.golang.org rerouted to http://127.0.0.1:50941
https://vcs-test.golang.org rerouted to https://127.0.0.1:50942
go test proxy running at GOPROXY=http://127.0.0.1:50943/mod
=== RUN TestScript/build_plugin_reproducible
=== PAUSE TestScript/build_plugin_reproducible
=== CONT TestScript/build_plugin_reproducible
script_test.go:132: 2024-01-03T19:36:00Z
script_test.go:134: $WORK=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/build_plugin_reproducible1539278168
script_test.go:156:
PATH=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/testbin:/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/go/bin:/Users/swarming/.swarming/w/ir/tools/bin:/Users/swarming/.swarming/cipd_cache/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin
HOME=/no-home
CCACHE_DISABLE=1
GOARCH=amd64
TESTGO_GOHOSTARCH=amd64
GOCACHE=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/gocache
GOCOVERDIR=
GODEBUG=
GOEXE=
GOEXPERIMENT=
GOOS=darwin
TESTGO_GOHOSTOS=darwin
GOPROXY=http://127.0.0.1:50943/mod
GOPRIVATE=
GOROOT=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/go
GOROOT_FINAL=
GOTRACEBACK=system
TESTGO_GOROOT=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/go
TESTGO_EXE=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/testbin/go
TESTGO_VCSTEST_HOST=127.0.0.1:50941
TESTGO_VCSTEST_TLS_HOST=127.0.0.1:50942
TESTGO_VCSTEST_CERT=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/vcstest63272679/cert.pem
TESTGONETWORK=panic
GOSUMDB=localhost.localdev/sumdb+00000c67+AcTrnkbUA+TU4heY3hkjiSES/DSQniBqIeQ/YppAUtK6
GONOPROXY=
GONOSUMDB=
GOVCS=*:all
devnull=/dev/null
goversion=1.22
CMDGO_TEST_RUN_MAIN=true
HGRCPATH=
GOTOOLCHAIN=auto
newline=
WORK=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/build_plugin_reproducible1539278168
TMPDIR=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/build_plugin_reproducible1539278168/tmp
GOPATH=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/build_plugin_reproducible1539278168/gopath
PWD=/Users/swarming/.swarming/w/itsy7ss432/workdir-swarming-task/tmp/cmd-go-test-555795669/tmpdir1260424131/build_plugin_reproducible1539278168/gopath/src
> [!buildmode:plugin] skip
[condition not met]
> [short] skip
[condition not met]
> go build -trimpath -buildvcs=false -buildmode=plugin -o a.so main.go
> go build -trimpath -buildvcs=false -buildmode=plugin -o b.so main.go
> cmp -q a.so b.so
script_test.go:156: FAIL: testdata/script/build_plugin_reproducible.txt:6: cmp -q a.so b.so: a.so and b.so differ
--- FAIL: TestScript (0.10s)
--- FAIL: TestScript/build_plugin_reproducible (8.78s)
FAIL
FAIL cmd/go 9.089s
FAIL
# Wrote results from "mpratt-gotip-darwin-amd64-longtest-0" to "/tmp/gomote2019819704/mpratt-gotip-darwin-amd64-longtest-0.stdout".
Error running run: unable to execute ./go/bin/go: rpc error: code = Unknown desc = command execution failed: exit status 1
Complete recipe:
Note: Depending on which machine you get, the mac_toolchain binary referenced below may be at either /Users/swarming/.swarming/w/ir/tools/bin/mac_toolchain or /Volumes/Work/s/w/ir/tools/bin/mac_toolchain.
$ export GOROOT=/home/prattmic/src/go/ # set to your GOROOT
$ export GOMOTELUCI=true
$ gomote create gotip-darwin-amd64-longtest
mpratt-gotip-darwin-amd64-longtest-1
$ export INSTANCE=mpratt-gotip-darwin-amd64-longtest-1
$ gomote run ${INSTANCE} /bin/mkdir /tmp/xcode
$ gomote run ${INSTANCE} /Users/swarming/.swarming/w/ir/tools/bin/mac_toolchain install -xcode-version 15a240d -output-dir /tmp/xcode/Xcode.app
$ gomote run ${INSTANCE} /usr/bin/sudo xcode-select --switch /tmp/xcode/Xcode.app
$ gomote push ${INSTANCE}
$ gomote run ${INSTANCE} ./go/src/make.bash
$ gomote run ${INSTANCE} ./go/bin/go test -run=TestScript/build_plugin_reproducible -v cmd/go
The only differences between a.so and b.so are something near the beginning of the file (still investigating) and the Go Build ID:
diff -C 5 a.hex b.hex
*** a.hex Wed Jan 3 12:03:42 2024
--- b.hex Wed Jan 3 12:03:47 2024
***************
*** 117,128 ****
00000740: 0b00 0000 5000 0000 0000 0000 a70b 0000 ....P...........
00000750: a70b 0000 5603 0000 fd0e 0000 3300 0000 ....V.......3...
00000760: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000770: 0000 0000 0000 0000 6048 1700 5801 0000 ........`H..X...
00000780: 0000 0000 0000 0000 0000 0000 0000 0000 ................
! 00000790: 1b00 0000 1800 0000 edfb 2d7d ab6e 374d ..........-}.n7M
! 000007a0: 8eba 7c75 012c c264 3200 0000 2000 0000 ..|u.,.d2... ...
000007b0: 0100 0000 0000 0e00 0000 0e00 0100 0000 ................
000007c0: 0300 0000 0007 f703 2a00 0000 1000 0000 ........*.......
000007d0: 0000 0000 0000 0000 0c00 0000 3800 0000 ............8...
000007e0: 1800 0000 0200 0000 0000 3805 0000 0100 ..........8.....
000007f0: 2f75 7372 2f6c 6962 2f6c 6962 5379 7374 /usr/lib/libSyst
--- 117,128 ----
00000740: 0b00 0000 5000 0000 0000 0000 a70b 0000 ....P...........
00000750: a70b 0000 5603 0000 fd0e 0000 3300 0000 ....V.......3...
00000760: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000770: 0000 0000 0000 0000 6048 1700 5801 0000 ........`H..X...
00000780: 0000 0000 0000 0000 0000 0000 0000 0000 ................
! 00000790: 1b00 0000 1800 0000 f0cb 7393 b5bd 3b76 ..........s...;v
! 000007a0: 9fb5 5f03 dd32 4c8a 3200 0000 2000 0000 .._..2L.2... ...
000007b0: 0100 0000 0000 0e00 0000 0e00 0100 0000 ................
000007c0: 0300 0000 0007 f703 2a00 0000 1000 0000 ........*.......
000007d0: 0000 0000 0000 0000 0c00 0000 3800 0000 ............8...
000007e0: 1800 0000 0200 0000 0000 3805 0000 0100 ..........8.....
000007f0: 2f75 7372 2f6c 6962 2f6c 6962 5379 7374 /usr/lib/libSyst
***************
*** 916,928 ****
00003930: cccc cccc cccc cccc cccc cccc cccc cccc ................
00003940: ff20 476f 2062 7569 6c64 2049 443a 2022 . Go build ID: "
00003950: 4c42 3648 7a64 376b 6c31 6258 726e 7948 LB6Hzd7kl1bXrnyH
00003960: 697a 5859 2f70 2d7a 3839 4146 354e 6136 izXY/p-z89AF5Na6
00003970: 6f31 736e 4466 704a 682f 3644 745f 4f44 o1snDfpJh/6Dt_OD
! 00003980: 4769 7571 452d 7652 4e52 7831 5878 2f66 GiuqE-vRNRx1Xx/f
! 00003990: 4735 6f64 7563 424e 6d4f 7053 6455 4e51 G5oducBNmOpSdUNQ
! 000039a0: 7861 5522 0a20 ffcc cccc cccc cccc cccc xaU". ..........
000039b0: cccc cccc cccc cccc cccc cccc cccc cccc ................
000039c0: 5548 89e5 4883 ec10 4c8b 3dd1 4607 0049 UH..H...L.=.F..I
000039d0: 8b4f 084c 8b3d c646 0700 498b 170f 1f00 .O.L.=.F..I.....
000039e0: 4839 c87d 1b73 3948 c1e0 0448 8b0c 0248 H9.}.s9H...H...H
000039f0: 8b5c 0208 4889 c848 83c4 105d c30f 1f00 .\..H..H...]....
--- 916,928 ----
00003930: cccc cccc cccc cccc cccc cccc cccc cccc ................
00003940: ff20 476f 2062 7569 6c64 2049 443a 2022 . Go build ID: "
00003950: 4c42 3648 7a64 376b 6c31 6258 726e 7948 LB6Hzd7kl1bXrnyH
00003960: 697a 5859 2f70 2d7a 3839 4146 354e 6136 izXY/p-z89AF5Na6
00003970: 6f31 736e 4466 704a 682f 3644 745f 4f44 o1snDfpJh/6Dt_OD
! 00003980: 4769 7571 452d 7652 4e52 7831 5878 2f6d GiuqE-vRNRx1Xx/m
! 00003990: 436f 5971 6470 5854 386a 7a54 6e64 6d4f CoYqdpXT8jzTndmO
! 000039a0: 3038 5022 0a20 ffcc cccc cccc cccc cccc 08P". ..........
000039b0: cccc cccc cccc cccc cccc cccc cccc cccc ................
000039c0: 5548 89e5 4883 ec10 4c8b 3dd1 4607 0049 UH..H...L.=.F..I
000039d0: 8b4f 084c 8b3d c646 0700 498b 170f 1f00 .O.L.=.F..I.....
000039e0: 4839 c87d 1b73 3948 c1e0 0448 8b0c 0248 H9.}.s9H...H...H
000039f0: 8b5c 0208 4889 c848 83c4 105d c30f 1f00 .\..H..H...]....
Based on the otool output, it looks like this other component is the LC_UUID value: EDFB2D7D-AB6E-374D-8EBA-7C75012CC264 vs F0CB7393-B5BD-3B76-9FB5-5F03DD324C8A.
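The UUIDs reported by otool can be read straight out of the hex dump above. The following is a minimal sketch, assuming the bytes at offset 0x790 form a little-endian LC_UUID load command (command number 0x1b, size 24, then a 16-byte payload); the function and variable names are illustrative and not taken from cmd/link.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// lcUUID is the Mach-O load command number for LC_UUID.
const lcUUID = 0x1b

// Raw bytes copied from offsets 0x790-0x7a7 of the two hex dumps.
var aSoCommand = []byte{
	0x1b, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00,
	0xed, 0xfb, 0x2d, 0x7d, 0xab, 0x6e, 0x37, 0x4d,
	0x8e, 0xba, 0x7c, 0x75, 0x01, 0x2c, 0xc2, 0x64,
}

var bSoCommand = []byte{
	0x1b, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00,
	0xf0, 0xcb, 0x73, 0x93, 0xb5, 0xbd, 0x3b, 0x76,
	0x9f, 0xb5, 0x5f, 0x03, 0xdd, 0x32, 0x4c, 0x8a,
}

// parseUUIDCommand decodes a raw little-endian LC_UUID load command and
// returns the UUID in the 8-4-4-4-12 form that otool prints.
func parseUUIDCommand(raw []byte) (string, error) {
	if len(raw) < 24 {
		return "", errors.New("load command too short for LC_UUID")
	}
	cmd := binary.LittleEndian.Uint32(raw[0:4])
	size := binary.LittleEndian.Uint32(raw[4:8])
	if cmd != lcUUID || size != 24 {
		return "", fmt.Errorf("not an LC_UUID command: cmd=%#x size=%d", cmd, size)
	}
	u := raw[8:24]
	return fmt.Sprintf("%X-%X-%X-%X-%X", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16]), nil
}

func main() {
	for name, raw := range map[string][]byte{"a.so": aSoCommand, "b.so": bSoCommand} {
		uuid, err := parseUUIDCommand(raw)
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Println(name, uuid)
	}
}
```

Decoding both dumps this way gives the same two values that otool reports, which supports reading the first differing region as the LC_UUID payload.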
I don't know MachO very well, but it seems that this is just another build ID...
Huh. See also https://bugs.chromium.org/p/chromium/issues/detail?id=1068970. 😵💫
The output of GODEBUG=gocachehash=1 is identical for both builds.
Yeah, looks like the LC_UUID depends on at least the last component of the output file path:
https://github.com/apple-opensource/ld64/blame/e28c028b20af187a16a7161d89e91868a450cadc/src/ld/OutputFile.cpp#L3724-L3733
I'm not sure what _options.buildContextName() is derived from.
Looks like maybe the RC_RELEASE environment variable? (https://github.com/apple-opensource/ld64/blob/e28c028b20af187a16a7161d89e91868a450cadc/src/ld/Options.cpp#L4529-L4530C30)
Thanks for the reference! Looking at the go build -x output, the last few steps are:
GOROOT_FINAL='$GOROOT' /Volumes/Work/s/w/it_9mfvcff/workdir-swarming-task/go/pkg/tool/darwin_amd64/link -o a.out.so -importcfg $WORK/b001/importcfg.link -installsuffix dynlink -pluginpath plugin/unnamed-bf82aa353b25c4b8a6ab19fdb37f3d07a25be28e -buildmode=plugin -buildid=LB6Hzd7kl1bXrnyHizXY/p-z89AF5Na6o1snDfpJh/6Dt_ODGiuqE-vRNRx1Xx/LB6Hzd7kl1bXrnyHizXY -extld=clang $WORK/b001/_pkg_.a
/Volumes/Work/s/w/it_9mfvcff/workdir-swarming-task/go/pkg/tool/darwin_amd64/buildid -w $WORK/b001/exe/a.out.so # internal
mv $WORK/b001/exe/a.out.so b.so
Running the link step (first line) multiple times, even without changing the path, yields a .so with a different LC_UUID each time:
$ /Volumes/Work/s/w/it_9mfvcff/workdir-swarming-task/go/pkg/tool/darwin_amd64/link -o a.out.so -importcfg $WORK/b001/importcfg.link -installsuffix dynlink -pluginpath plugin/unnamed-bf82aa353b25c4b8a6ab19fdb37f3d07a25be28e -buildmode=plugin -buildid=LB6Hzd7kl1bXrnyHizXY/p-z89AF5Na6o1snDfpJh/6Dt_ODGiuqE-vRNRx1Xx/LB6Hzd7kl1bXrnyHizXY -extld=clang $WORK/b001/_pkg_.a
$ shasum5.30 a.out.so
9029867749bbecd942c0037526ababa6f0d83932 a.out.so
$ /Volumes/Work/s/w/it_9mfvcff/workdir-swarming-task/go/pkg/tool/darwin_amd64/link -o a.out.so -importcfg $WORK/b001/importcfg.link -installsuffix dynlink -pluginpath plugin/unnamed-bf82aa353b25c4b8a6ab19fdb37f3d07a25be28e -buildmode=plugin -buildid=LB6Hzd7kl1bXrnyHizXY/p-z89AF5Na6o1snDfpJh/6Dt_ODGiuqE-vRNRx1Xx/LB6Hzd7kl1bXrnyHizXY -extld=clang $WORK/b001/_pkg_.a
$ shasum5.30 a.out.so
d76b438421660ffd24d4ca06dc30b3150b0b9fee a.out.so
Diffing the binary shows that the UUID is the only difference (Go Build ID is identical presumably because I'm not running the buildid command).
It doesn't seem to be related to the file paths. cmd/link invokes clang like so:
host link: "clang" "-arch" "x86_64" "-m64" "-Wl,-headerpad,1144" "-Wl,-flat_namespace" "-Wl,-bind_at_load" "-dynamiclib" "-o" "a.out.so" "-Qunused-arguments" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/go.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000000.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000001.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000002.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000003.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000004.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000005.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000006.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000007.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000008.o" "/Volumes/Work/s/w/it_9mfvcff/go-link-202179431/000009.o" "-O2" "-g" "-lpthread"
The go-link-202179431 path component changes each iteration, but this can be forced to be the same with -tmpdir /tmp/tmp:
host link: "clang" "-arch" "x86_64" "-m64" "-Wl,-headerpad,1144" "-Wl,-flat_namespace" "-Wl,-bind_at_load" "-dynamiclib" "-o" "a.out.so" "-Qunused-arguments" "/tmp/tmp/go.o" "/tmp/tmp/000000.o" "/tmp/tmp/000001.o" "/tmp/tmp/000002.o" "/tmp/tmp/000003.o" "/tmp/tmp/000004.o" "/tmp/tmp/000005.o" "/tmp/tmp/000006.o" "/tmp/tmp/000007.o" "/tmp/tmp/000008.o" "/tmp/tmp/000009.o" "-O2" "-g" "-lpthread"
Even with identical paths each time we get different output.
Maybe we can just set the -no_uuid flag if it exists?
https://github.com/apple-opensource/ld64/blob/e28c028b20af187a16a7161d89e91868a450cadc/src/ld/Options.cpp#L3412-L3415
Perhaps, but I'd like to better understand what is happening. Plus it seems like some users may want the UUID, as Chrome did.
FWIW, the clang command consistently generates identical output from the same inputs. It seems it is the output of dsymutil that is differing:
host link: "clang" "-arch" "x86_64" "-m64" "-Wl,-headerpad,1144" "-Wl,-flat_namespace" "-Wl,-bind_at_load" "-dynamiclib" "-o" "a.out.so" "-Qunused-arguments" "/tmp/tmp2/go.o" "/tmp/tmp2/000000.o" "/tmp/tmp2/000001.o" "/tmp/tmp2/000002.o" "/tmp/tmp2/000003.o" "/tmp/tmp2/000004.o" "/tmp/tmp2/000005.o" "/tmp/tmp2/000006.o" "/tmp/tmp2/000007.o" "/tmp/tmp2/000008.o" "/tmp/tmp2/000009.o" "-O2" "-g" "-lpthread"
host link dsymutil: "/tmp/xcode/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/dsymutil" "-f" "a.out.so" "-o" "/tmp/tmp2/go.dwarf"
host link strip: "/tmp/xcode/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/strip" "-S" "a.out.so"
$ for f in /tmp/tmp/*; do shasum5.30 $f; done
60cc7e733a36962e7bc73ee38291e6f37fca8272 /tmp/tmp/000000.o
60cc7e733a36962e7bc73ee38291e6f37fca8272 /tmp/tmp/000001.o
c85aae746d1a1f270a1cea350b377d1b5f9ff376 /tmp/tmp/000002.o
a4741bd785ce981348a908880917a43074de02a7 /tmp/tmp/000003.o
2050abddb774ad2600414caa5596608d5260424c /tmp/tmp/000004.o
160fbd77666d9949ef8b8fa502ad1e665388dad3 /tmp/tmp/000005.o
0c4db5947508c4a8c035484c4ef97182efb572e1 /tmp/tmp/000006.o
e2724d0e3997897da123884a7dd2496d094bf45e /tmp/tmp/000007.o
311978e69c428fbfc059f92eb48ae0a8e3e19d80 /tmp/tmp/000008.o
5fc69d54547370cba80b9d5272b62db3126ff85f /tmp/tmp/000009.o
aa2a253f8abd8c84c3ef27ddfc9c12bc1481f277 /tmp/tmp/go.dwarf
480e9721586f5e764d59826677acff5d6bbe3588 /tmp/tmp/go.o
556b5a818027717b0399b1e94ba268ff147c932e /tmp/tmp/trivial.c
$ for f in /tmp/tmp2/*; do shasum5.30 $f; done
60cc7e733a36962e7bc73ee38291e6f37fca8272 /tmp/tmp2/000000.o
60cc7e733a36962e7bc73ee38291e6f37fca8272 /tmp/tmp2/000001.o
c85aae746d1a1f270a1cea350b377d1b5f9ff376 /tmp/tmp2/000002.o
a4741bd785ce981348a908880917a43074de02a7 /tmp/tmp2/000003.o
2050abddb774ad2600414caa5596608d5260424c /tmp/tmp2/000004.o
160fbd77666d9949ef8b8fa502ad1e665388dad3 /tmp/tmp2/000005.o
0c4db5947508c4a8c035484c4ef97182efb572e1 /tmp/tmp2/000006.o
e2724d0e3997897da123884a7dd2496d094bf45e /tmp/tmp2/000007.o
311978e69c428fbfc059f92eb48ae0a8e3e19d80 /tmp/tmp2/000008.o
5fc69d54547370cba80b9d5272b62db3126ff85f /tmp/tmp2/000009.o
3c3cf42192f5f6427f05b2dceca7c5733e6f1721 /tmp/tmp2/go.dwarf
480e9721586f5e764d59826677acff5d6bbe3588 /tmp/tmp2/go.o
556b5a818027717b0399b1e94ba268ff147c932e /tmp/tmp2/trivial.c
(go.dwarf differs)
Edit: I'm not 100% certain about dsymutil being at fault here, as I can't seem to reproduce the non-reproducibility when running clang + dsymutil manually.
cc @thanm see https://github.com/golang/go/issues/64947#issuecomment-1875889679 for reproducer instructions
I spent a little while looking at this. What's weird is that the actual DWARF in the two go.dwarf files is identical; what is different is (again) the uuid. E.g.
$ llvm-dwarfdump-16 xxx/tmpdir1/go.dwarf > dw1.txt
$ llvm-dwarfdump-16 xxx/tmpdir2/go.dwarf > dw2.txt
$ diff dw1.txt dw2.txt
1c1
< xxx/tmpdir1/go.dwarf: file format Mach-O 64-bit x86-64
---
> xxx/tmpdir2/go.dwarf: file format Mach-O 64-bit x86-64
$
$ llvm-objdump-16 --macho --all-headers xxx/tmpdir1/go.dwarf > h1.txt
$ llvm-objdump-16 --macho --all-headers xxx/tmpdir2/go.dwarf > h2.txt
$ diff h1.txt h2.txt
1c1
< xxx/tmpdir1/go.dwarf:
---
> xxx/tmpdir2/go.dwarf:
3880c3880
< uuid 3BA8085B-DD85-312C-B9AD-2CEDAE928E62
---
> uuid E559C1A0-DDFF-3BD3-8CD8-7652DC367F9F
$
So basically what seems to be happening is that dsymutil is generating a different uuid each time and embedding it into the go.dwarf file, in spite of the fact that the dwarf is the same, hmm.
I will spend a little time digging into the dsymutil source code, maybe I can find out more.
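The same comparison can be done from Go without llvm-objdump. The sketch below uses the standard `debug/macho` package to pull the LC_UUID payload out of a Mach-O image; since that package has no dedicated LC_UUID type, the raw load command bytes are inspected directly. The names `machoUUID` and `minimalImage` are hypothetical, and `minimalImage` builds a synthetic header-plus-one-command image purely so the example runs anywhere. For real files, `macho.Open` on the path to each go.dwarf would take its place.

```go
package main

import (
	"bytes"
	"debug/macho"
	"encoding/binary"
	"fmt"
)

// loadCmdUUID is the Mach-O load command number for LC_UUID.
const loadCmdUUID = 0x1b

// machoUUID returns the 16-byte payload of the first LC_UUID load
// command in the given Mach-O image, or nil if there is none.
func machoUUID(image []byte) ([]byte, error) {
	f, err := macho.NewFile(bytes.NewReader(image))
	if err != nil {
		return nil, err
	}
	for _, l := range f.Loads {
		raw := l.Raw()
		if len(raw) >= 24 && f.ByteOrder.Uint32(raw[0:4]) == loadCmdUUID {
			return raw[8:24], nil
		}
	}
	return nil, nil
}

// minimalImage builds a synthetic 64-bit little-endian Mach-O image
// containing only a header and a single LC_UUID command (demo only).
func minimalImage(uuid [16]byte) []byte {
	var buf bytes.Buffer
	header := []uint32{
		macho.Magic64,
		uint32(macho.CpuAmd64),
		3,  // CPU subtype: x86_64 "all"
		6,  // file type: MH_DYLIB
		1,  // number of load commands
		24, // total size of load commands
		0,  // flags
		0,  // reserved
	}
	binary.Write(&buf, binary.LittleEndian, header)
	binary.Write(&buf, binary.LittleEndian, []uint32{loadCmdUUID, 24})
	buf.Write(uuid[:])
	return buf.Bytes()
}

func main() {
	one := minimalImage([16]byte{0x3b, 0xa8, 0x08, 0x5b})
	two := minimalImage([16]byte{0xe5, 0x59, 0xc1, 0xa0})
	u1, err1 := machoUUID(one)
	u2, err2 := machoUUID(two)
	if err1 != nil || err2 != nil {
		fmt.Println("parse error:", err1, err2)
		return
	}
	fmt.Printf("uuid1=%X\nuuid2=%X\nsame=%v\n", u1, u2, bytes.Equal(u1, u2))
}
```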
FWIW, the version of Xcode we're installing is 15.0.0. I peeked at the release notes for 15.0.1 and 15.1 and nothing stood out as a fix for this kind of issue, but I'll see if we can get a different version to test.
OK (duh) in fact dsymutil is just faithfully copying the uuid from its input, so the problem here is that clang is generating a different uuid. I'll look into the clang source code instead.
FWIW, it looks like there are more versions of Xcode available to try out, though I haven't tested them:
- mac_toolchain install -xcode-version 15a240d: 15.0
- mac_toolchain install -xcode-version 15A507: 15.0.1
- mac_toolchain install -xcode-version 15C65: 15.1
- mac_toolchain install -xcode-version 15C5500c: 15.2 (beta, I guess)
OK, I think I am making some progress here. For a while I thought this might be an ld-prime problem, but that turned out to be a red herring. In fact it looks like it is a bit simpler than that.
Running the link with -ldflags=-v -tmpdir=/tmp/tmp I see
# command-line-arguments
HEADER = -H1 -T0x1001000 -R0x1000
host link: "clang" "-arch" "x86_64" "-m64" "-Wl,-headerpad,1144" "-Wl,-flat_namespace" "-Wl,-bind_at_load"
"-dynamiclib" "-o" "/Users/swarming/.swarming/w/itvprlhos9/workdir-swarming-task/tmp/go-build2833294421/b001/exe/a.out.so" "-Qunused-arguments" "/tmp/tmp/go.o" "/tmp/tmp/000000.o"
"/tmp/tmp/000001.o" "/tmp/tmp/000002.o" "/tmp/tmp/000003.o" "/tmp/tmp/000004.o" "/tmp/tmp/000005.o"
"/tmp/tmp/000006.o" "/tmp/tmp/000007.o" "/tmp/tmp/000008.o" "-O2" "-g" "-lpthread" "-ld64"
Note the "-o" output, which incorporates the go build dir go-build2833294421, which is going to vary from build to build. The problem is that this is being incorporated into the dynamic info in the a.out.so output, e.g. from the output of llvm-objdump-16 --macho --all-headers I see:
Load command 4
cmd LC_ID_DYLIB
cmdsize 128
name /Users/swarming/.swarming/w/itvprlhos9/workdir-swarming-task/tmp/go-build1516003297/b001/exe/a.out.so (offset 24)
time stamp 1 Thu Jan 1 00:00:01 1970
current version 0.0.0
compatibility version 0.0.0
and the external linker is almost certainly going to hash this section when creating the build ID.
Not sure what the best approach is to fix this. Also not sure why we aren't seeing similar problems with the older gomotes (I will spin one up and compare).
@thanm, that sounds very similar to an existing reproducibility workaround here: https://cs.opensource.google/go/go/+/master:src/cmd/go/internal/work/gc.go;l=649-663;drc=66b8107a26e515bbe19855d358bdf12bd6326347
Perhaps we need to extend that workaround to more build modes, or take a similar approach when running other commands?
Well phooey, I am afraid I've had a Homer Simpson moment here.
My gomote expired, and I created a new one, but when I started using the new one I didn't update the PATH setting in my script, so it wasn't picking up the correct version of Go. It looks like with LUCI gomotes the location of GOROOT is slightly different each time:
bindir from my first gomote: "/Users/swarming/.swarming/w/ituz4dfd04/workdir-swarming-task/go/bin"
bindir from my second gomote: "/Users/swarming/.swarming/w/itvprlhos9/workdir-swarming-task/go/bin"
Oh well, a learning experience I suppose.
That explains why I was not picking up Cherry's fix (https://go-review.googlesource.com/c/go/+/478196, which extends the workaround that you mention Bryan).
Now I'm back to seeing only a difference in the UUID.
One more important bit of info: problem goes away if I build with -extldflags=-ld_classic, meaning that this may be another thing we can add to the long list of problems that crop up with "ld-prime" (e.g. issue #61229).
Looking at the setup we have on our old-style gomotes I see:
$ gomote run `cat mote.txt` softwareupdate --history
Display Name Version Date
------------ ------- ----
Command Line Tools for Xcode 14.0 11/07/2022, 16:16:24
Command Line Tools for Xcode 14.1 11/07/2022, 16:16:24
i.e. command line tools, not a complete Xcode installation. For the new LUCI gomotes we are obviously using a full Xcode install, and we're using version 15, which defaults to ld-prime.
> FWIW, it looks like there are more versions of Xcode available to try out, though I haven't tested them:
> - mac_toolchain install -xcode-version 15a240d: 15.0
> - mac_toolchain install -xcode-version 15A507: 15.0.1
> - mac_toolchain install -xcode-version 15C65: 15.1
> - mac_toolchain install -xcode-version 15C5500c: 15.2 (beta, I guess)
I just tested the most recent one (15.2) and it appears to have the same problem. Hmph.
OK, one more update. I can reproduce the problem with just the C compiler, and what I think must be going on is that the name of the output file is being incorporated into the UUID. If I do:
$ clang -arch x86_64 -m64 -Wl,-headerpad,1144 -Wl,-flat_namespace -Wl,-bind_at_load -dynamiclib -o a.so example.cpp
$ clang -arch x86_64 -m64 -Wl,-headerpad,1144 -Wl,-flat_namespace -Wl,-bind_at_load -dynamiclib -o b.so example.cpp
$ llvm-objdump-16 --macho --all-headers a.so > ash.txt
$ llvm-objdump-16 --macho --all-headers b.so > bsh.txt
then I see a difference, whereas if I instead do
$ clang -arch x86_64 -m64 -Wl,-headerpad,1144 -Wl,-flat_namespace -Wl,-bind_at_load -dynamiclib -o a.so example.cpp
$ mv a.so b.so
$ clang -arch x86_64 -m64 -Wl,-headerpad,1144 -Wl,-flat_namespace -Wl,-bind_at_load -dynamiclib -o a.so example.cpp
$ llvm-objdump-16 --macho --all-headers a.so > ash.txt
$ llvm-objdump-16 --macho --all-headers b.so > bsh.txt
The UUIDs are the same (the only thing different in the second example is that both builds target a.so).
How would we feel about changing the test in question to target the same filename? Or does the current ld-prime behavior not really meet our criteria for reproducible builds?
Huh. I guess it would be ok for the tests to mv the output file so that they can go build -o to the same filename, although that seems a bit subtle.
Does the LC_UUID depend only on the output file's basename, or on the directory path as well? I think it's probably ok for it to depend on the basename, but (especially if the user is building with -trimpath) we should ensure that it doesn't depend on the current working directory.
> Does the LC_UUID depend only on the output file's basename
I checked just now and it looks like it is just the output file basename, not the directory. If I run the C compiler building example.cpp once in directory xxx, then do the same compile in directory yyy, I get identical binaries. I'll send a CL, although I agree it is a bit weird.
I poked a bit at the other failure (TestScript/build_issue48319). That one looks like it will require another Go command fix -- since this is not a shared-mode build the "-o" argument being passed to the external linker is a full path. Hence the difference in build IDs.
@bcmills would it make sense to take the code you mentioned before (https://cs.opensource.google/go/go/+/master:src/cmd/go/internal/work/gc.go;l=649-663;drc=66b8107a26e515bbe19855d358bdf12bd6326347) and extend it even farther (e.g. any link being done on Darwin)?
Change https://go.dev/cl/554059 mentions this issue: cmd/go/testdata: tweak build_plugin_reproducible test for Xcode 15
If it is only a test issue, and a user's normal "go build" (by default, without any extra weird flags) is still reproducible, I think it is okay to just update the test. Another option is that we (over)write the LC_UUID in the Go linker after C linking, based on the file content (or the Go build ID). (We overwrite the binary for DWARF combining anyway, but that may be changed with #62577.)
For another issue, if it is not a shared object there would be no LC_ID_DYLIB, so it is still the UUID that is affected by the output file path?
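The option of overwriting the LC_UUID from the Go build ID can be illustrated with a small sketch. This is not what cmd/link does; it is a hypothetical helper, assuming a name-based scheme in the style of RFC 4122 version 3: MD5 of the build ID, with the version and variant bits forced. The UUIDs quoted earlier in this thread also carry version nibble 3, but whether the system linker derives them the same way is not established here.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// uuidFromBuildID is a hypothetical helper that derives a stable
// 16-byte UUID from a Go build ID string. The same build ID always
// yields the same UUID, regardless of the output file name.
func uuidFromBuildID(buildID string) [16]byte {
	sum := md5.Sum([]byte(buildID))
	sum[6] = (sum[6] & 0x0f) | 0x30 // set version nibble to 3
	sum[8] = (sum[8] & 0x3f) | 0x80 // set RFC 4122 variant bits (10xx)
	return sum
}

// formatUUID renders the payload in the 8-4-4-4-12 form used by otool.
func formatUUID(u [16]byte) string {
	return fmt.Sprintf("%X-%X-%X-%X-%X", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

func main() {
	// Build ID taken from the link command quoted earlier in the thread.
	id := "LB6Hzd7kl1bXrnyHizXY/p-z89AF5Na6o1snDfpJh/6Dt_ODGiuqE-vRNRx1Xx/LB6Hzd7kl1bXrnyHizXY"
	fmt.Println(formatUUID(uuidFromBuildID(id)))
}
```

A content-derived value like this would keep a UUID in the binary, which some users may want, while removing the dependence on the -o path.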
> If it is only a test issue, and a user's normal "go build" (by default, without any extra weird flags) is still reproducible, I think it is okay to just update the test.
If I understand correctly, it's in an awkward grey area: the build is “reproducible” but only if you specify an output filename with the same basename at each invocation.
That is:
$ go build -trimpath -o foo
$ mv foo bar
$ go build -trimpath -o foo
will produce a foo identical to bar, but
$ go build -trimpath -o foo
$ go build -trimpath -o bar
will not.
Since -trimpath is supposed to redact local filenames, and the name of the output file is arguably a local filename, that technically fails reproducibility. On the other hand, it is still the case that repeating exactly the same command (provided that the -o flag is also the same) should continue to produce the same (reproducible) output bytes independent of the working directory.