cmd/link: Trampoline insertion breaks DWARF Line Program Table output on Darwin/ARM64

Open jquirke opened this issue 3 years ago • 3 comments

What version of Go are you using (go version)?

$ go version
go version go1.19 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/qjeremy/Library/Caches/go-build"
GOENV="/Users/qjeremy/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/qjeremy/go-code/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/qjeremy/go-code"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.19"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/qjeremy/go-code/src/code.uber.internal/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/0h/nzm2hwq95tq8d9mpsqmr7c6w0000gn/T/go-build3074789463=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. On Darwin ARM64, create a simple hello world:
package main

import "fmt"

func main() {fmt.Printf("Hello world\n")}
  2. Build it with trampolines forced on. In practice, ARM64 uses trampolines for jumps beyond 128MB in larger binaries:

$ go build -ldflags '-debugtramp=2'

  3. Check that it runs:
$ ./hello
Hello world
  4. Debug it with dlv 1.9.0 and set a breakpoint on main.main
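
For context on the build step: the ARM64 B and BL instructions encode a signed 26-bit offset counted in 4-byte instructions, so a direct call reaches at most 128MB in either direction, and anything farther has to go through a trampoline. The following is only a minimal sketch of that range check. The name needsTrampoline and the sample addresses are hypothetical and are not taken from the linker source.

```go
package main

import "fmt"

// ARM64 B/BL carry a signed 26-bit immediate counted in 4-byte
// instructions, so the reachable byte offsets are [-2^27, 2^27-4].
const (
	minBranchOffset = -(1 << 27)
	maxBranchOffset = (1 << 27) - 4
)

// needsTrampoline is a hypothetical helper (not the linker's function):
// it reports whether a direct branch from address from to address to
// would fall outside the B/BL range.
func needsTrampoline(from, to uint64) bool {
	off := int64(to) - int64(from)
	return off < minBranchOffset || off > maxBranchOffset
}

func main() {
	const mb = 1 << 20
	base := uint64(0x100000000) // illustrative address only
	fmt.Println(needsTrampoline(base, base+64*mb))  // false
	fmt.Println(needsTrampoline(base, base+200*mb)) // true
}
```

With -debugtramp=2 the linker is told to use trampolines even when the target is in range, which is why a tiny hello world is enough to reproduce the problem.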

What did you expect to see?

(dlv) b main.main
Breakpoint 1 set at 0x100ee1500 for main.main() ./hello.go:5
(dlv) c
> main.main() ./hello.go:5 (hits goroutine(1):1 total:1) (PC: 0x100ee1500)
Warning: debugging optimized function
     1:	package main
     2:
     3:	import "fmt"
     4:
=>   5:	func main() {fmt.Printf("Hello world\n")}
(dlv) disass
TEXT main.main(SB) /Users/qjeremy/go-code/src/code.uber.internal/marketplace/driver-pricing/hello/hello.go
	hello.go:5	0x100ee14f0	900b40f9	MOVD 16(R28), R16
	hello.go:5	0x100ee14f4	f1030091	MOVD RSP, R17
	hello.go:5	0x100ee14f8	3f0210eb	CMP R16, R17
	hello.go:5	0x100ee14fc	69020054	BLS 19(PC)
=>	hello.go:5	0x100ee1500*	fe0f1bf8	MOVD.W R30, -80(RSP)
	hello.go:5	0x100ee1504	fd831ff8	MOVD R29, -8(RSP)

What did you see instead?

Line information is missing.

(dlv) b main.main
Breakpoint 1 set at 0x104462270 for main.main() :0
(dlv) c
Stopped at: 0x104462270
=>   1:	no source available
(dlv) disass
TEXT main.main(SB)
	.:0	0x104462260	900b40f9	MOVD 16(R28), R16
	.:0	0x104462264	f1030091	MOVD RSP, R17
	.:0	0x104462268	3f0210eb	CMP R16, R17
	.:0	0x10446226c	69020054	BLS 19(PC)
=>	.:0	0x104462270*	fe0f1bf8	MOVD.W R30, -80(RSP)

Analysis

The trampoline path in the linker converts trampolined internal symbols into external symbols (via cloneToExternal).

The DWARF generation code, e.g. in writelines, skips external symbols under the explicitly stated assumption that they never have auxsyms. That assumption does not hold for trampolined symbols.

Indeed, this proof-of-concept change appears to fix the problem:

https://github.com/golang/go/compare/master...jquirke:go:Linker_DWARF
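
The effect described in this analysis can be illustrated with a toy model. This is not the cmd/link code: the sym type, its fields, and linesEmitted are all hypothetical stand-ins, and the sketch only shows how an "external symbols never have auxsyms" shortcut drops line info for a symbol that became external through trampoline insertion.

```go
package main

import "fmt"

// sym is a toy stand-in for a linker function symbol. It is not the
// cmd/link loader API; every field here is an illustrative assumption.
type sym struct {
	name     string
	external bool // true once the symbol has been cloned to external
	hasLines bool // stands in for the DWARF lines auxiliary symbol
}

// linesEmitted returns the names of symbols whose line info would be
// written. With skipExternal set, it mimics the assumption that external
// symbols never carry auxsyms.
func linesEmitted(syms []sym, skipExternal bool) []string {
	var out []string
	for _, s := range syms {
		if skipExternal && s.external {
			continue // line info is silently dropped here
		}
		if s.hasLines {
			out = append(out, s.name)
		}
	}
	return out
}

func main() {
	syms := []sym{
		{name: "main.main", external: true, hasLines: true}, // trampolined
		{name: "fmt.Printf", external: false, hasLines: true},
	}
	fmt.Println(linesEmitted(syms, true))  // [fmt.Printf]
	fmt.Println(linesEmitted(syms, false)) // [main.main fmt.Printf]
}
```

In the model, dropping the external-symbol check restores the entry for main.main, which matches the behaviour reported for the proof-of-concept change.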

Comments

Although the repro forces trampolines, there are many ARM64 binaries at Uber that require relocations beyond +/- 124MB and are therefore not debuggable.

jquirke avatar Aug 06 '22 21:08 jquirke

Your patch looks good to me. Looks like the PR is blocked due to CLA, but I'm happy to review when it arrives in Gerrit.

thanm avatar Aug 08 '22 16:08 thanm

I think you can just call (*Loader).auxs ( https://cs.opensource.google/go/go/+/master:src/cmd/link/internal/loader/loader.go;l=1854 ), and drop the IsExternal condition.

Also, see https://go.dev/doc/contribute for how to contribute to Go. We cannot do code review on GitHub. Thanks.

cherrymui avatar Aug 08 '22 17:08 cherrymui

Change https://go.dev/cl/422154 mentions this issue: cmd/link: fix broken DWARF LPT on trampoline architectures

gopherbot avatar Aug 08 '22 21:08 gopherbot

It seems we also need a backport issue for 1.18. Please close the backport issue if that isn't the case.

cagedmantis avatar Aug 17 '22 16:08 cagedmantis

@gopherbot please open a backport to 1.18.

cagedmantis avatar Aug 17 '22 16:08 cagedmantis

Backport issue(s) opened: #54502 (for 1.18).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

gopherbot avatar Aug 17 '22 16:08 gopherbot

is this darwin/arm64 only, or linux/arm64 too?

lizthegrey avatar Aug 17 '22 21:08 lizthegrey

Technically it includes linux/arm64 as well. But generally one would not build programs with the -debugtramp=2 flag. In the default setting, trampolines may still be used if the program is very large, and such programs may be affected.

This bug does not affect the correctness of the program, only debug info generation.

cherrymui avatar Aug 17 '22 22:08 cherrymui

This affects all trampoline architectures.

Internally at Uber, we are seeing many SWEs on Apple M1 setups who cannot practically debug (line-level step through) statically linked Go programs larger than 124MB, which is the trampoline threshold for Darwin/ARM64.

The -debugtramp=2 flag is just a trivial way to reproduce this. We see real-world binaries, linked without special flags, that require linker trampolines and are affected.

jquirke avatar Aug 17 '22 23:08 jquirke