go icon indicating copy to clipboard operation
go copied to clipboard

runtime: SIGTRAMP while profiling arm64 hits new "traceback stuck" assert

Open lizthegrey opened this issue 1 year ago • 3 comments

What version of Go are you using (go version)?

$ go version
go version go1.18.5 linux/arm64

Does this issue reproduce with the latest release?

Have not tested with go1.19 yet.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/lizf/.cache/go-build"
GOENV="/home/lizf/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/lizf/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/lizf/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.18.5"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3053713255=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Run continuous profiling via scraping /debug/pprof endpoints including mutex and blocking profiles, while performing a workload consuming from Kafka using Sarama and Klauspost Zstd libraries.

  runtime.SetBlockProfileRate(100000)
  runtime.SetMutexProfileFraction(100)

What did you expect to see?

No crashes.

What did you see instead?

The following crashes implicating unexpected return pc for runtime.sigtramp and traceback stuck (assert added in https://go-review.googlesource.com/c/go/+/400575/ by @cherrymui / #52116 which was backported to go1.18.5):

runtime: traceback stuck. pc=0x12eb5 sp=0x4019365840
stack: frame={sp:0x4019365840, fp:0x4019365840} stack=[0x4019364000,0x4019366000)
0x0000004019365740:  0x0000000000ba9494 <github.com/Shopify/sarama.releaseCrc32Field+0x0000000000000034>  0x00000040193657b8 
0x0000004019365750:  0x0000000000bcd8e0 <github.com/Shopify/sarama.(*realDecoder).pop+0x0000000000000060>  0x00000040193657c8 
0x0000004019365760:  0x0000000000000000  0x0000000000000000 
0x0000004019365770:  0x0000000000000700  0x0000000001068260 
0x0000004019365780:  0x00000000010ab360  0x0000000000005701 
0x0000004019365790:  0x0000000000000000  0x0000000000000000 
0x00000040193657a0:  0x0000000000000000  0x0000004000601c00 
0x00000040193657b0:  0x000000401c3c7100  0x00000040193657f8 
0x00000040193657c0:  0x0000000000bd070c <github.com/Shopify/sarama.(*RecordBatch).decode+0x000000000000037c>  0x000000400ff4e0f0 
0x00000040193657d0:  0x0000000000bd064c <github.com/Shopify/sarama.(*RecordBatch).decode+0x00000000000002bc>  0x0000004026c7805f 
0x00000040193657e0:  0x0000000000107de8 <strings.HasSuffix+0x0000000000000058>  0x0000000000000000 
0x00000040193657f0:  0x0000000000000000  0x00000040193658c8 
0x0000004019365800:  0x0000000000bd0eac <github.com/Shopify/sarama.(*Records).decode+0x000000000000009c>  0x0000004017c948c0 
0x0000004019365810:  0x0000004026d6cf92  0x0000000000012eb5 <internal/bytealg.IndexByteString+0x0000000000000005> 
0x0000004019365820:  0x0000000000012eb5 <internal/bytealg.IndexByteString+0x0000000000000005>  0x0000004019365898 
0x0000004019365830:  0x00012ee60001ce60  0x0000000000012eb5 <internal/bytealg.IndexByteString+0x0000000000000005> 
0x0000004019365840: >0x0000000000012eb5 <internal/bytealg.IndexByteString+0x0000000000000005>  0x00000000000000d7 
0x0000004019365850:  0x0000000000000000  0x0000004019365800 
0x0000004019365860:  0x0000000000bd0474 <github.com/Shopify/sarama.(*RecordBatch).decode+0x00000000000000e4>  0x00000040193658b8 
0x0000004019365870:  0x0000000000000000  0x0000000000000000 
0x0000004019365880:  0x0000000000000000  0x0000000000000000 
0x0000004019365890:  0x0000000000000000  0x0000004026d6cf92 
0x00000040193658a0:  0x000000400ff4e0f0  0x0000000000000000 
0x00000040193658b0:  0x0000000000000000  0x0000000000bd09c0 <github.com/Shopify/sarama.(*RecordBatch).decode.func1+0x0000000000000000> 
0x00000040193658c0:  0x000000400ff4e0f0  0x0000004019365918 
0x00000040193658d0:  0x0000000000bb6f1c <github.com/Shopify/sarama.(*FetchResponseBlock).decode+0x000000000000033c>  0x0000004012612c80 
0x00000040193658e0:  0x000000000154bfa8  0x0000004017c948c0 
0x00000040193658f0:  0x0000000000bb6f0c <github.com/Shopify/sarama.(*FetchResponseBlock).decode+0x000000000000032c>  0x0000000000000018 
0x0000004019365900:  0x00000000010d87c0  0x0000004000048301 
0x0000004019365910:  0x0000004017c94a40  0x00000040193659c8 
0x0000004019365920:  0x0000000000bb78e8 <github.com/Shopify/sarama.(*FetchResponse).decode+0x00000000000002a8>  0x0000004009b9fc38 
0x0000004019365930:  0x000000000154bfa8  0x0000004017c948c0 
fatal error: traceback stuck

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigtramp called from 0xffff985dc5c0
stack: frame={sp:0x400eacacd0, fp:0x400eacada0} stack=[0x400eac4000,0x400eacc000)
0x000000400eacabd0:  0x000000400eacac28  0x02c595e603246a9e 
0x000000400eacabe0:  0x0000000002c595e5  0x000000400eaba000 
0x000000400eacabf0:  0x0000000000000000  0x000000400eac9fe0 
0x000000400eacac00:  0x0000004013a80000  0x000000400eacada0 
0x000000400eacac10:  0x000000400eacae20  0x000000400eacac98 
0x000000400eacac20:  0x000000000007d58c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x000000400eacac30:  0x0000004000601c00  0x000000400eacac58 
0x000000400eacac40:  0x0000004000603040  0x02bcec7502c540fb 
0x000000400eacac50:  0x00246aa103246aa0  0x0000000000000000 
0x000000400eacac60:  0x0000000000000000  0x0000000000000000 
0x000000400eacac70:  0x0000000000000000  0x0000000000000000 
0x000000400eacac80:  0x0000004000603040  0x000000400eacada0 
0x000000400eacac90:  0x000000400eacae20  0x000000400eacacc8 
0x000000400eacaca0:  0x000000000007cea4 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x000000400eacacb0:  0x000000400eacada0  0x000000400eacae20 
0x000000400eacacc0:  0x02c5956e03246a9e  0x000000400eacbff0 
0x000000400eacacd0: <0x0000ffff985dc5c0  0x000000400000001b 
0x000000400eacace0:  0x000000400eacada0  0x000000400eacae20 
0x000000400eacacf0:  0x000000000005abba <runtime.selparkcommit+0x000000000000000a>  0x1265726f74537070 
0x000000400eacad00:  0x021210eed9a50809  0x623824220118000a 
0x000000400eacad10:  0x302d333733663636  0x0000000000000015 
0x000000400eacad20:  0x0000000000000071  0x000000400b753e98 
0x000000400eacad30:  0x000000000153a0c0  0x0000004000603040 
0x000000400eacad40:  0x000000400eacacc8  0x3e66376972bea4d0 
0x000000400eacad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x000000400eacad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x000000400eacad70:  0x3fcde3175e2d734a  0x3ffc439d143a5197 
0x000000400eacad80:  0xbfd1a21a8e7d2cb5  0x0000000000000000 
0x000000400eacad90:  0x0404040202020303  0x0000000000000000 
0x000000400eacada0: >0x000000000000001b  0x00000000fffffffe 
0x000000400eacadb0:  0x0000000000000466  0x0000000000000000 
0x000000400eacadc0:  0x0000000000000000  0x0000000000000000 
0x000000400eacadd0:  0x0000000000000000  0x0000000000000000 
0x000000400eacade0:  0x0000000000000000  0x0000000000000000 
0x000000400eacadf0:  0x0000000000000000  0x0000000000000000 
0x000000400eacae00:  0x0000000000000000  0x0000000000000000 
0x000000400eacae10:  0x0000000000000000  0x0000000000000000 
0x000000400eacae20:  0x0000000000000000  0x0000000000000000 
0x000000400eacae30:  0x000000400eac4000  0x0000004000000000 
0x000000400eacae40:  0x0000000000008000  0x0000000000000000 
0x000000400eacae50:  0x0000000000000000  0x0000000000000000 
0x000000400eacae60:  0x0000000000000000  0x0000000000000000 
0x000000400eacae70:  0x00002a1c49103b0c  0x02bcec6f02b9baa6 
0x000000400eacae80:  0x02b9baaf03246a9f  0x02c5956e02bcec75 
0x000000400eacae90:  0x0000000000000000  0x0000004013a805d0 
runtime.throw({0x1262952?, 0x1fae6c0?})
	/usr/local/go/src/runtime/panic.go:992 +0x50
runtime.gentraceback(0x203ad80?, 0x400eac9a90?, 0x400eaca968?, 0x4000603040, 0x0, 0x400eaca968, 0x40, 0x0, 0x0?, 0x6)
	/usr/local/go/src/runtime/traceback.go:488 +0xd98
runtime.sigprof(0x79650, 0x400eacac08?, 0x4013a80010?, 0x4000603040, 0x4000601c00)
	/usr/local/go/src/runtime/proc.go:4507 +0xf0
runtime.sighandler(0x1b?, 0x4000601c00?, 0x400eacac58?, 0x4000603040?)
	/usr/local/go/src/runtime/signal_unix.go:613 +0x514
runtime.sigtrampgo(0x1b, 0x400eacada0, 0x400eacae20)
	/usr/local/go/src/runtime/signal_unix.go:477 +0x164
runtime.sigtrampgo(0x1b, 0x400eacada0, 0x400eacae20)
	<autogenerated>:1 +0x1c
runtime: unexpected return pc for runtime.sigtramp called from 0xffff985dc5c0
stack: frame={sp:0x400eacacd0, fp:0x400eacada0} stack=[0x400eac4000,0x400eacc000)
0x000000400eacabd0:  0x000000400eacac28  0x02c595e603246a9e 
0x000000400eacabe0:  0x0000000002c595e5  0x000000400eaba000 
0x000000400eacabf0:  0x0000000000000000  0x000000400eac9fe0 
0x000000400eacac00:  0x0000004013a80000  0x000000400eacada0 
0x000000400eacac10:  0x000000400eacae20  0x000000400eacac98 
0x000000400eacac20:  0x000000000007d58c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x000000400eacac30:  0x0000004000601c00  0x000000400eacac58 
0x000000400eacac40:  0x0000004000603040  0x02bcec7502c540fb 
0x000000400eacac50:  0x00246aa103246aa0  0x0000000000000000 
0x000000400eacac60:  0x0000000000000000  0x0000000000000000 
0x000000400eacac70:  0x0000000000000000  0x0000000000000000 
0x000000400eacac80:  0x0000004000603040  0x000000400eacada0 
0x000000400eacac90:  0x000000400eacae20  0x000000400eacacc8 
0x000000400eacaca0:  0x000000000007cea4 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x000000400eacacb0:  0x000000400eacada0  0x000000400eacae20 
0x000000400eacacc0:  0x02c5956e03246a9e  0x000000400eacbff0 
0x000000400eacacd0: <0x0000ffff985dc5c0  0x000000400000001b 
0x000000400eacace0:  0x000000400eacada0  0x000000400eacae20 
0x000000400eacacf0:  0x000000000005abba <runtime.selparkcommit+0x000000000000000a>  0x1265726f74537070 
0x000000400eacad00:  0x021210eed9a50809  0x623824220118000a 
0x000000400eacad10:  0x302d333733663636  0x0000000000000015 
0x000000400eacad20:  0x0000000000000071  0x000000400b753e98 
0x000000400eacad30:  0x000000000153a0c0  0x0000004000603040 
0x000000400eacad40:  0x000000400eacacc8  0x3e66376972bea4d0 
0x000000400eacad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x000000400eacad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x000000400eacad70:  0x3fcde3175e2d734a  0x3ffc439d143a5197 
0x000000400eacad80:  0xbfd1a21a8e7d2cb5  0x0000000000000000 
0x000000400eacad90:  0x0404040202020303  0x0000000000000000 
0x000000400eacada0: >0x000000000000001b  0x00000000fffffffe 
0x000000400eacadb0:  0x0000000000000466  0x0000000000000000 
0x000000400eacadc0:  0x0000000000000000  0x0000000000000000 
0x000000400eacadd0:  0x0000000000000000  0x0000000000000000 
0x000000400eacade0:  0x0000000000000000  0x0000000000000000 
0x000000400eacadf0:  0x0000000000000000  0x0000000000000000 
0x000000400eacae00:  0x0000000000000000  0x0000000000000000 
0x000000400eacae10:  0x0000000000000000  0x0000000000000000 
0x000000400eacae20:  0x0000000000000000  0x0000000000000000 
0x000000400eacae30:  0x000000400eac4000  0x0000004000000000 
0x000000400eacae40:  0x0000000000008000  0x0000000000000000 
0x000000400eacae50:  0x0000000000000000  0x0000000000000000 
0x000000400eacae60:  0x0000000000000000  0x0000000000000000 
0x000000400eacae70:  0x00002a1c49103b0c  0x02bcec6f02b9baa6 
0x000000400eacae80:  0x02b9baaf03246a9f  0x02c5956e02bcec75 
0x000000400eacae90:  0x0000000000000000  0x0000004013a805d0 
runtime.sigtramp()
	/usr/local/go/src/runtime/sys_linux_arm64.s:489 +0x84

goroutine 89 [running]:
runtime: traceback stuck. pc=0x3ff93 sp=0x40048952c0
stack: frame={sp:0x40048952c0, fp:0x40048952c0} stack=[0x4004894000,0x4004896000)
0x00000040048951c0:  0x0000000000000040  0x00000040048951e8 
0x00000040048951d0:  0x00000000005019a4 <github.com/klauspost/compress/zstd.(*sequenceDec).init+0x0000000000000054>  0x0000004004895310 
0x00000040048951e0:  0x00000040008c6000  0x0000004004895228 
0x00000040048951f0:  0x0000000000501b9c <github.com/klauspost/compress/zstd.(*sequenceDecs).initialize+0x000000000000016c>  0x0000004004895338 
0x0000004004895200:  0x00000000004e2b84 <github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed+0x0000000000000164>  0x0000004000530058 
0x0000004004895210:  0x00000040002e22a0  0x00000000006c0001 <text/template/parse.(*endNode).Copy+0x0000000000000001> 
0x0000004004895220:  0x0000000f00800005  0x000001fd00280903 
0x0000004004895230:  0x00000000004e2ebc <github.com/klauspost/compress/zstd.(*blockDec).prepareSequences+0x000000000000025c>  0x000000000003ff93 <runtime.goroutineProfileWithLabels.func4.1+0x0000000000000003> 
0x0000004004895240:  0x00000000000ee1a4 <os.UserHomeDir+0x00000000000000c4>  0x0000000000000811 
0x0000004004895250:  0x0000000000000015  0x0000000000020000 <runtime.advanceEvacuationMark+0x0000000000000000> 
0x0000004004895260:  0x00000000006c0001 <text/template/parse.(*endNode).Copy+0x0000000000000001>  0x0000000000000000 
0x0000004004895270:  0x0000000000000006  0x0000000f00800005 
0x0000004004895280:  0x000001fd00280903  0x0000000000000000 
0x0000004004895290:  0x000000400720ea00  0x0000004004895338 
0x00000040048952a0:  0x00000000004e2a78 <github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed+0x0000000000000058>  0x0000004000530058 
0x00000040048952b0:  0x00000040002e22a0  0x00000000000ee1a4 <os.UserHomeDir+0x00000000000000c4> 
0x00000040048952c0: >0x000000000003ff93 <runtime.goroutineProfileWithLabels.func4.1+0x0000000000000003>  0x0000000000020000 <runtime.advanceEvacuationMark+0x0000000000000000> 
0x00000040048952d0:  0x0000004009f42000  0x00000040002e22a0 
0x00000040048952e0:  0x0000000000001a81  0x0000004009f42000 
0x00000040048952f0:  0x0000004005b17500  0x000000400720ea00 
0x0000004004895300:  0x0000004004a28a00  0x00000040002e22a0 
0x0000004004895310:  0x0000004003bcf000  0x000000400885150c 
0x0000004004895320:  0x000000000000467b  0x000000000002ba01 <runtime.gcStart+0x0000000000000701> 
0x0000004004895330:  0x000000000000002e  0x0000004004895398 
0x0000004004895340:  0x00000000004e1974 <github.com/klauspost/compress/zstd.(*blockDec).decodeBuf+0x0000000000000284>  0x0000004000530058 
0x0000004004895350:  0x0000000000000000  0x0000000000000000 
0x0000004004895360:  0x0000000000000000  0x0000004000530050 
0x0000004004895370:  0x0000000000000000  0x0000000000000000 
0x0000004004895380:  0x0000000000000000  0x0000000000020000 <runtime.advanceEvacuationMark+0x0000000000000000> 
0x0000004004895390:  0x0000000000000000  0x0000004004895458 
0x00000040048953a0:  0x00000000004fd2cc <github.com/klauspost/compress/zstd.(*frameDec).runDecoder+0x000000000000016c>  0x000000400058c000 
0x00000040048953b0:  0x0000004000530050  0x0000000000000000 
fatal error: traceback stuck
goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigtramp called from 0xffffabfd25c0
stack: frame={sp:0x400000acd0, fp:0x400000ada0} stack=[0x4000004000,0x400000c000)
0x000000400000abd0:  0x000000400000ac28  0x0000000000000000 
0x000000400000abe0:  0x0000000000000000  0x0000004000002000 
0x000000400000abf0:  0x0000000000000000  0x0000000000000000 
0x000000400000ac00:  0x0000000000000000  0x000000400000ada0 
0x000000400000ac10:  0x000000400000ae20  0x000000400000ac98 
0x000000400000ac20:  0x000000000007d58c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x000000400000ac30:  0x0000000002046680  0x000000400000ac58 
0x000000400000ac40:  0x0000004000521d40  0x0000000000000000 
0x000000400000ac50:  0x0000000000000000  0x0000000000000000 
0x000000400000ac60:  0x0000000000000000  0x0000000000000000 
0x000000400000ac70:  0x0000000000000000  0x0000000000000000 
0x000000400000ac80:  0x0000004000521d40  0x000000400000ada0 
0x000000400000ac90:  0x000000400000ae20  0x000000400000acc8 
0x000000400000aca0:  0x000000000007cea4 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x000000400000acb0:  0x000000400000ada0  0x000000400000ae20 
0x000000400000acc0:  0x0000000000000000  0x000000400000bff0 
0x000000400000acd0: <0x0000ffffabfd25c0  0x000000000000001b 
0x000000400000ace0:  0x000000400000ada0  0x000000400000ae20 
0x000000400000acf0:  0x0000000000000009  0x0000000000000015 
0x000000400000ad00:  0x0000000000000000  0x00000040008c9fab 
0x000000400000ad10:  0x000000000001ff93 <runtime.evacuate+0x00000000000004b3>  0x0000000000000009 
0x000000400000ad20:  0x000000000000032d  0x00000040048953d0 
0x000000400000ad30:  0x0000000002078100  0x0000004000521d40 
0x000000400000ad40:  0x000000400000acc8  0x3e66376972bea4d0 
0x000000400000ad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x000000400000ad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x000000400000ad70:  0x3fb8fcd2e18d7b62  0x3ffe7032d1e7284a 
0x000000400000ad80:  0xbfbab6204490a1e0  0x0000000000000000 
0x000000400000ad90:  0x0000000000000000  0x0000000000000000 
0x000000400000ada0: >0x000000000000001b  0x00000000fffffffe 
0x000000400000adb0:  0x0000000000004d16  0x0000000000000000 
0x000000400000adc0:  0x0000000000000000  0x0000000000000000 
0x000000400000add0:  0x0000000000000000  0x0000000000000000 
0x000000400000ade0:  0x0000000000000000  0x0000000000000000 
0x000000400000adf0:  0x0000000000000000  0x0000000000000000 
0x000000400000ae00:  0x0000000000000000  0x0000000000000000 
0x000000400000ae10:  0x0000000000000000  0x0000000000000000 
0x000000400000ae20:  0x0000000000000000  0x0000000000000000 
0x000000400000ae30:  0x0000004000004000  0x0000000000000000 
0x000000400000ae40:  0x0000000000008000  0x0000000000000000 
0x000000400000ae50:  0x0000000000000000  0x0000000000000000 
0x000000400000ae60:  0x0000000000000000  0x0000000000000000 
0x000000400000ae70:  0x0000000000000000  0x0000000000000000 
0x000000400000ae80:  0x0000000000000000  0x0000000000000000 
0x000000400000ae90:  0x0000000000000000  0x0000000000000000 
runtime.throw({0x1275238?, 0x1fbe760?})
    /usr/local/go/src/runtime/panic.go:992 +0x50
runtime.gentraceback(0x204afc0?, 0x0?, 0x400000a968?, 0x4000521d40, 0x0, 0x400000a968, 0x40, 0x0, 0x0?, 0x6)
    /usr/local/go/src/runtime/traceback.go:488 +0xd98
runtime.sigprof(0x79650, 0x400000ac08?, 0x0?, 0x4000521d40, 0x2046680)
    /usr/local/go/src/runtime/proc.go:4507 +0xf0
runtime.sighandler(0x1b?, 0x2046680?, 0x400000ac58?, 0x4000521d40?)
    /usr/local/go/src/runtime/signal_unix.go:613 +0x514
runtime.sigtrampgo(0x1b, 0x400000ada0, 0x400000ae20)
    /usr/local/go/src/runtime/signal_unix.go:477 +0x164
runtime.sigtrampgo(0x1b, 0x400000ada0, 0x400000ae20)
    <autogenerated>:1 +0x1c
runtime: unexpected return pc for runtime.sigtramp called from 0xffffabfd25c0
stack: frame={sp:0x400000acd0, fp:0x400000ada0} stack=[0x4000004000,0x400000c000)
0x000000400000abd0:  0x000000400000ac28  0x0000000000000000 
0x000000400000abe0:  0x0000000000000000  0x0000004000002000 
0x000000400000abf0:  0x0000000000000000  0x0000000000000000 
0x000000400000ac00:  0x0000000000000000  0x000000400000ada0 
0x000000400000ac10:  0x000000400000ae20  0x000000400000ac98 
0x000000400000ac20:  0x000000000007d58c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x000000400000ac30:  0x0000000002046680  0x000000400000ac58 
0x000000400000ac40:  0x0000004000521d40  0x0000000000000000 
0x000000400000ac50:  0x0000000000000000  0x0000000000000000 
0x000000400000ac60:  0x0000000000000000  0x0000000000000000 
0x000000400000ac70:  0x0000000000000000  0x0000000000000000 
0x000000400000ac80:  0x0000004000521d40  0x000000400000ada0 
0x000000400000ac90:  0x000000400000ae20  0x000000400000acc8 
0x000000400000aca0:  0x000000000007cea4 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x000000400000acb0:  0x000000400000ada0  0x000000400000ae20 
0x000000400000acc0:  0x0000000000000000  0x000000400000bff0 
0x000000400000acd0: <0x0000ffffabfd25c0  0x000000000000001b 
0x000000400000ace0:  0x000000400000ada0  0x000000400000ae20 
0x000000400000acf0:  0x0000000000000009  0x0000000000000015 
0x000000400000ad00:  0x0000000000000000  0x00000040008c9fab 
0x000000400000ad10:  0x000000000001ff93 <runtime.evacuate+0x00000000000004b3>  0x0000000000000009 
0x000000400000ad20:  0x000000000000032d  0x00000040048953d0 
0x000000400000ad30:  0x0000000002078100  0x0000004000521d40 
0x000000400000ad40:  0x000000400000acc8  0x3e66376972bea4d0 
0x000000400000ad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x000000400000ad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x000000400000ad70:  0x3fb8fcd2e18d7b62  0x3ffe7032d1e7284a 
0x000000400000ad80:  0xbfbab6204490a1e0  0x0000000000000000 
0x000000400000ad90:  0x0000000000000000  0x0000000000000000 
0x000000400000ada0: >0x000000000000001b  0x00000000fffffffe 
0x000000400000adb0:  0x0000000000004d16  0x0000000000000000 
0x000000400000adc0:  0x0000000000000000  0x0000000000000000 
0x000000400000add0:  0x0000000000000000  0x0000000000000000 
0x000000400000ade0:  0x0000000000000000  0x0000000000000000 
0x000000400000adf0:  0x0000000000000000  0x0000000000000000 
0x000000400000ae00:  0x0000000000000000  0x0000000000000000 
0x000000400000ae10:  0x0000000000000000  0x0000000000000000 
0x000000400000ae20:  0x0000000000000000  0x0000000000000000 
0x000000400000ae30:  0x0000004000004000  0x0000000000000000 
0x000000400000ae40:  0x0000000000008000  0x0000000000000000 
0x000000400000ae50:  0x0000000000000000  0x0000000000000000 
0x000000400000ae60:  0x0000000000000000  0x0000000000000000 
0x000000400000ae70:  0x0000000000000000  0x0000000000000000 
0x000000400000ae80:  0x0000000000000000  0x0000000000000000 
0x000000400000ae90:  0x0000000000000000  0x0000000000000000 
runtime.sigtramp()
    /usr/local/go/src/runtime/sys_linux_arm64.s:489 +0x84
goroutine 197 [running]:

lizthegrey avatar Aug 08 '22 04:08 lizthegrey

Given that you commented on #52116, I assume that this application was also suffering from that bug before this (seemingly incomplete) fix?

prattmic avatar Aug 08 '22 13:08 prattmic

This stack looks weird. 0x12eb5 is not even an aligned PC (ARM64 instructions are always 4 bytes). Why would it appear on the stack? 0x3ff93 from another stack trace is similarly weird.

Maybe they are not PCs? Given that a nearby stack word has content 0x00012ee60001ce60, maybe 0x12eb5, 0x12ee6, 0x1ce60 are not PCs but some (32-bit?) integers on stack?

On both stack traces the SIGPROF lands on PC 0x79650. What is that PC? Is it a Go function? Thanks.

cherrymui avatar Aug 08 '22 16:08 cherrymui

Given that you commented on #52116, I assume that this application was also suffering from that bug before this (seemingly incomplete) fix?

~It was. But we backported the fix and it was working fine. The new change for us is turning on not just CPU/mem profiling, but block profiling too.~

lizthegrey avatar Aug 08 '22 21:08 lizthegrey

This stack looks weird. 0x12eb5 is not even an aligned PC (ARM64 instructions are always 4 bytes). Why would it appear on the stack? 0x3ff93 from another stack trace is similarly weird.

Maybe they are not PCs? Given that a nearby stack word has content 0x00012ee60001ce60, maybe 0x12eb5, 0x12ee6, 0x1ce60 are not PCs but some (32-bit?) integers on stack?

On both stack traces the SIGPROF lands on PC 0x79650. What is that PC? Is it a Go function? Thanks.

$ go tool addr2line retriever
0x79650
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:321

lizthegrey avatar Aug 08 '22 21:08 lizthegrey

Given that you commented on #52116, I assume that this application was also suffering from that bug before this (seemingly incomplete) fix?

It was. But we backported the fix and it was working fine. The new change for us is turning on not just CPU/mem profiling, but block profiling too.

Addendum: I just trawled errors, and indeed, we do have significant numbers of "traceback stuck" dating back multiple months, just in applications that were able to restart after crashing without coming to my attention.

lizthegrey avatar Aug 08 '22 23:08 lizthegrey

Here are some additional crashes from other applications:

$ go tool addr2line barbet
0x79410
0x79414
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:321
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:322
$ go tool addr2line beagle
0x79410
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:321
$ go tool addr2line shepherd
0x790a0
0x791d0
0x791d4
runtime.systemstack
/usr/local/go/src/runtime/asm_arm64.s:204
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:321
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:322

lizthegrey avatar Aug 09 '22 00:08 lizthegrey

https://github.com/golang/go/blob/go1.18.5/src/runtime/asm_arm64.s#L321 -- culprit lines here.

similarity to https://github.com/golang/go/issues/52829, does this need to not have NOFRAME ie https://go-review.googlesource.com/c/go/+/405482/?

lizthegrey avatar Aug 09 '22 02:08 lizthegrey

Also repros with go1.19.0

runtime: traceback stuck. pc=0x58e9e0 sp=0x4035cfada0
stack: frame={sp:0x4035cfada0, fp:0x4035cfada0} stack=[0x4035cf8000,0x4035cfc000)
0x0000004035cfaca0:  0x0000004047098bd0  0x000000010001aae8 
0x0000004035cfacb0:  0xc90000000001bfb0  0x0000ffff94c3f878 
0x0000004035cfacc0:  0xc9c4a878b22454ee  0x0000000000000006 
0x0000004035cfacd0:  0x000000000058db7c <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField+0x000000000000004c>  0x000000403cd55520 
0x0000004035cface0:  0x000000000058dbf0 <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField+0x00000000000000c0>  0x0000004035cfad38 
0x0000004035cfacf0:  0x00000040470acb20  0x0000004035cfad58 
0x0000004035cfad00:  0x000000000058e980 <github.com/honeycombio/libhoney-go.(*Event).AddField+0x0000000000000110>  0x0000000000fcae00 
0x0000004035cfad10:  0x0000004047098bd0  0x00000040469ff6c8 
0x0000004035cfad20:  0x0000000000000014  0x0100000000000014 
0x0000004035cfad30:  0x00000040470b2248  0x0000000000ebb3a0 
0x0000004035cfad40:  0x000000000058dc50 <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1+0x0000000000000000>  0x00000040470b2248 
0x0000004035cfad50:  0x0000004035cfad40  0x0000004035cfadb8 
0x0000004035cfad60:  0x00000000005a5898 <github.com/honeycombio/beeline-go/trace.(*Span).AddField+0x0000000000000158>  0x00000040470b2240 
0x0000004035cfad70:  0x00000040469ff6c8  0x0000000000000014 
0x0000004035cfad80:  0x0000000000ebb3a0  0x0000000001ed2ac8 
0x0000004035cfad90:  0x01000000005a5840  0x0000000000f9a880 
0x0000004035cfada0: >0x000000000058e9e0 <github.com/honeycombio/libhoney-go.(*Event).AddField.func1+0x0000000000000000>  0x00000040470b226c 
0x0000004035cfadb0:  0x0000004035cfada0  0x0000004035cfae28 
0x0000004035cfadc0:  0x00000000005a7a4c <github.com/honeycombio/beeline-go.AddField+0x00000000000000cc>  0x0000ffff94c3f878 
0x0000004035cfadd0:  0x00000040469ff6c8  0x0000000001ed2ac8 
0x0000004035cfade0:  0x0000000000ebb3a0  0x0000000001ed2ac8 
0x0000004035cfadf0:  0x0000004035cfae08  0x0000004035cfae28 
0x0000004035cfae00:  0x01000000005a7a00  0x0000000000f9a880 
0x0000004035cfae10:  0x00000000005a5900 <github.com/honeycombio/beeline-go/trace.(*Span).AddField.func1+0x0000000000000000>  0x0000004043dc5688 
0x0000004035cfae20:  0x0000004035cfae10  0x0000004035cfae88 
0x0000004035cfae30:  0x0000000000d79884 <github.com/honeycombio/hound/cmd/shepherd/app.(*App).getSchema+0x00000000000000f4>  0x0000004043dc5600 
0x0000004035cfae40:  0x00000040469ff6c8  0x0000000000000014 
0x0000004035cfae50:  0x0000000000ebb3a0  0x0000000001ed2ac8 
0x0000004035cfae60:  0x000000400015aae0  0x0000000000000014 
0x0000004035cfae70:  0x0000000000ebb3a0  0x0000004043dc5600 
0x0000004035cfae80:  0x00000040469ff6c8  0x0000004035cfaf48 
0x0000004035cfae90:  0x0000000000d755bc <github.com/honeycombio/hound/cmd/shepherd/app.(*App).batchProcess.func1+0x000000000000043c>  0x0000000001491d40 
fatal error: traceback stuck

goroutine 0 [idle]:
runtime.throw({0x11b3553?, 0x1ebf180?})
	/usr/local/go/src/runtime/panic.go:1047 +0x40 fp=0x40009ac610 sp=0x40009ac5e0 pc=0x490f0
runtime.gentraceback(0x1f3c7e0?, 0x0?, 0x40009ac9a8?, 0x403cd55520, 0x0, 0x40009ac9a8, 0x40, 0x0, 0x0?, 0x6)
	/usr/local/go/src/runtime/traceback.go:488 +0xde4 fp=0x40009ac940 sp=0x40009ac610 pc=0x703f4
runtime.sigprof(0x7abe0, 0x40009acc48?, 0x0?, 0x403cd55520, 0x40009a4000)
	/usr/local/go/src/runtime/proc.go:4567 +0xe8 fp=0x40009acbb0 sp=0x40009ac940 pc=0x55618
runtime.sighandler(0x1b?, 0x40009a4000?, 0x40009acc98?, 0x40005e4000?)
	/usr/local/go/src/runtime/signal_unix.go:629 +0x574 fp=0x40009acc60 sp=0x40009acbb0 pc=0x60334
runtime.sigtrampgo(0x1b, 0x40009acda0, 0x40009ace20)
	/usr/local/go/src/runtime/signal_unix.go:479 +0x164 fp=0x40009acce0 sp=0x40009acc60 pc=0x5f9c4
runtime.sigtramp()
	/usr/local/go/src/runtime/sys_linux_arm64.s:471 +0x58 fp=0x40009acda0 sp=0x40009acce0 pc=0x7e228

goroutine 1503875 [running]:
runtime: traceback stuck. pc=0x58e9e0 sp=0x400ffaeda0
stack: frame={sp:0x400ffaeda0, fp:0x400ffaeda0} stack=[0x400ffac000,0x400ffb0000)
0x000000400ffaeca0:  0x0000004046ce2240  0x000000010001f0b8 
0x000000400ffaecb0:  0x7300000001243730  0x0000ffff9d953948 
0x000000400ffaecc0:  0xc8f36a7a4ff431e7  0x0000000000000007 
0x000000400ffaecd0:  0x000000000058db7c <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField+0x000000000000004c>  0x000000400cfa5520 
0x000000400ffaece0:  0x000000000058dc00 <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField+0x00000000000000d0>  0x000000400ffaed38 
0x000000400ffaecf0:  0x0000000000065708 <runtime.rawstringtmp+0x0000000000000048>  0x000000400ffaed58 
0x000000400ffaed00:  0x000000000058e980 <github.com/honeycombio/libhoney-go.(*Event).AddField+0x0000000000000110>  0x0000000000fcae00 
0x000000400ffaed10:  0x0000004046ce2240  0x0000004046110d50 
0x000000400ffaed20:  0x0000000000000016  0x0100000000000016 
0x000000400ffaed30:  0x00000040453eaf68  0x0000000000eccea0 
0x000000400ffaed40:  0x000000000058dc50 <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1+0x0000000000000000>  0x00000040453eaf68 
0x000000400ffaed50:  0x000000400ffaed40  0x000000400ffaedb8 
0x000000400ffaed60:  0x00000000005a5898 <github.com/honeycombio/beeline-go/trace.(*Span).AddField+0x0000000000000158>  0x00000040453eaf60 
0x000000400ffaed70:  0x0000004046110d50  0x0000000000000016 
0x000000400ffaed80:  0x0000000000eccea0  0x000000000147d320 
0x000000400ffaed90:  0x01000000005a5840  0x0000000000f9a880 
0x000000400ffaeda0: >0x000000000058e9e0 <github.com/honeycombio/libhoney-go.(*Event).AddField.func1+0x0000000000000000>  0x00000040453eaf8c 
0x000000400ffaedb0:  0x000000400ffaeda0  0x000000400ffaee28 
0x000000400ffaedc0:  0x00000000005a7a4c <github.com/honeycombio/beeline-go.AddField+0x00000000000000cc>  0x0000ffff9d953948 
0x000000400ffaedd0:  0x0000004046110d50  0x000000000147d320 
0x000000400ffaede0:  0x0000000000eccea0  0x000000000147d320 
0x000000400ffaedf0:  0x000000400ffaee08  0x000000400ffaee28 
0x000000400ffaee00:  0x01000000005a7a00  0x0000000000f9a880 
0x000000400ffaee10:  0x00000000005a5900 <github.com/honeycombio/beeline-go/trace.(*Span).AddField.func1+0x0000000000000000>  0x00000040453c2b88 
0x000000400ffaee20:  0x000000400ffaee10  0x000000400ffaee88 
0x000000400ffaee30:  0x0000000000d79a6c <github.com/honeycombio/hound/cmd/shepherd/app.(*App).getSchema+0x00000000000002dc>  0x00000040453c2b00 
0x000000400ffaee40:  0x0000004046110d50  0x0000000000000016 
0x000000400ffaee50:  0x0000000000eccea0  0x000000000147d320 
0x000000400ffaee60:  0x00000040006af0b0  0x0000000000000016 
0x000000400ffaee70:  0x0000000000eccea0  0x00000040453c2b00 
0x000000400ffaee80:  0x0000004046110d50  0x000000400ffaef48 
0x000000400ffaee90:  0x0000000000d755bc <github.com/honeycombio/hound/cmd/shepherd/app.(*App).batchProcess.func1+0x000000000000043c>  0x0000000001491d40 
fatal error: traceback stuck

goroutine 0 [idle]:
runtime.throw({0x11b3553?, 0x1ebf180?})
	/usr/local/go/src/runtime/panic.go:1047 +0x40 fp=0x400078a610 sp=0x400078a5e0 pc=0x490f0
runtime.gentraceback(0x1f3c7e0?, 0x0?, 0x400078a9a8?, 0x400cfa5520, 0x0, 0x400078a9a8, 0x40, 0x0, 0x0?, 0x6)
	/usr/local/go/src/runtime/traceback.go:488 +0xde4 fp=0x400078a940 sp=0x400078a610 pc=0x703f4
runtime.sigprof(0x7abe4, 0x400078ac48?, 0x0?, 0x400cfa5520, 0x4000780000)
	/usr/local/go/src/runtime/proc.go:4567 +0xe8 fp=0x400078abb0 sp=0x400078a940 pc=0x55618
runtime.sighandler(0x1b?, 0x4000780000?, 0x400078ac98?, 0x40007821a0?)
	/usr/local/go/src/runtime/signal_unix.go:629 +0x574 fp=0x400078ac60 sp=0x400078abb0 pc=0x60334
runtime.sigtrampgo(0x1b, 0x400078ada0, 0x400078ae20)
	/usr/local/go/src/runtime/signal_unix.go:479 +0x164 fp=0x400078ace0 sp=0x400078ac60 pc=0x5f9c4
runtime.sigtramp()
	/usr/local/go/src/runtime/sys_linux_arm64.s:471 +0x58 fp=0x400078ada0 sp=0x400078ace0 pc=0x7e228

goroutine 16981043 [running]:
go tool addr2line shepherd
0x7abe4
0x7abe0
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:324
runtime.morestack_noctxt
/usr/local/go/src/runtime/asm_arm64.s:323

https://github.com/golang/go/blob/go1.19/src/runtime/asm_arm64.s#L323-L324

lizthegrey avatar Aug 09 '22 15:08 lizthegrey

Thanks for the details @lizthegrey !

similarity to https://github.com/golang/go/issues/52829, does this need to not have NOFRAME ie https://go-review.googlesource.com/c/go/+/405482/?

I don't think this is necessary. The ARM32 function contains a call in some configurations, whereas the ARM64 function should not contain a call.

So the signal lands at the entry of morestack_noctxt or systemstack. At this point the stack switch hasn't occurred so it should be able to traceback normally. I don't immediately see what is wrong. I'll keep looking.

If you need a workaround you can try comment out the "traceback stuck" throw. The profile sample's stacks may be inaccurate (which probably was the case before that CL) but it probably won't crash.

cherrymui avatar Aug 09 '22 21:08 cherrymui

Need any additional debugging info? Any way that I can help?

lizthegrey avatar Aug 14 '22 15:08 lizthegrey

We have special cases for morestack and systemstack here, but I don't believe they are relevant because (a) funcID_morestack only applies to morestack, not morestack_noctxt, and (b) we should only hits these if we already switched stacks, which is not the case here.

@lizthegrey is this readily reproducible (ideally OSS, but I'm also curious if you could manually reproduce)? Or only occurs in prod?

prattmic avatar Aug 16 '22 15:08 prattmic

It reproduces multiple times per day in all of our environments. Sadly I'm not sure I can reduce to a minimal OSS test case.

One thing I'd be happy to do is to run a version of this that is a branch build that patches go runtime to hang rather than panic when this occurs, then I'd be happy to ssh in and dlv the binary.

lizthegrey avatar Aug 16 '22 15:08 lizthegrey

Change https://go.dev/cl/424196 mentions this issue: DO NOT SUBMIT: runtime: add dlog to gentraceback

gopherbot avatar Aug 16 '22 18:08 gopherbot

Change https://go.dev/cl/424195 mentions this issue: DO NOT SUBMIT: runtime: dump entire stack

gopherbot avatar Aug 16 '22 18:08 gopherbot

If you are willing to patch the runtime, there are a few changes that would be helpful.

  • https://go.dev/cl/424195: This just expands the stack dump to include the entire stack, which will help see more context. Since this only runs when crashing, it won't have any impact on normal execution.

  • https://go.dev/cl/424196: This adds debuglog logs to trace the behavior of gentraceback, which will help us see how to gets into the bad state. debuglog is a low-overhead in-memory ring buffer log in the runtime, which you must enable with go build -tags debuglog [1]. The log is dumped to stderr on crash. Without the build tag, the code is a complete no-op. With the build tag, it is logging every execution of gentraceback, so there may be a small performance degradation, but I don't think it will be noticeable in real-world applications.

I think the latter is a bit simpler than attaching with a debugger, since it can track iteration state (previous frames) that will be gone once we decide to crash. Plus you don't need to watch for hung tasks and manually inspect them, but that is an option as well if the patch is concerning.

[1] If adding tags to your build is difficult for some reason, you can also swap the build tag lines at the top of debuglog_off.go and debuglog_on.go.

prattmic avatar Aug 16 '22 18:08 prattmic

There's some slew since 1.18.5:

/usr/local/go/src/runtime/traceback.go:47:9: cannot use gp.goid (variable of type int64) as type uint64 in argument to d.u64

I'll fix the patch vs our base go version and try this overnight.

lizthegrey avatar Aug 16 '22 22:08 lizthegrey

Fresh crash for you with the additional telemetry.

Details
2022/08/17 15:22:56 maxprocs: Updating GOMAXPROCS=4: determined from CPU quota
time="2022-08-17T15:22:56Z" level=info msg="missing POD_NAME and POD_NAMESPACE in env, ok if not retriever"
time="2022-08-17T15:22:56Z" level=info msg="Debug service listening on :6060"
[LaunchDarkly] 2022/08/17 15:22:56 INFO: Starting LaunchDarkly client 5.0.2
[LaunchDarkly] 2022/08/17 15:22:56 INFO: Starting LaunchDarkly streaming connection
[LaunchDarkly] 2022/08/17 15:22:56 INFO: Waiting up to 30000 milliseconds for LaunchDarkly client to start...
[LaunchDarkly] 2022/08/17 15:22:56 INFO: Connecting to LaunchDarkly stream
[LaunchDarkly] 2022/08/17 15:22:56 INFO: LaunchDarkly streaming is active
[LaunchDarkly] 2022/08/17 15:22:56 INFO: Initialized LaunchDarkly client
runtime: traceback stuck. pc=0x12894 sp=0x40009d7840
stack: frame={sp:0x40009d7840, fp:0x40009d7840} stack=[0x40009d6000,0x40009d8000)
0x00000040009d7740:  0x000000000001cadc <runtime.mallocgc+0x000000000000068c>  0x00000040009d77b8 
0x00000040009d7750:  0x0000000000474210 <github.com/Shopify/sarama.(*realDecoder).pop+0x0000000000000060>  0x00000040009d77c8 
0x00000040009d7760:  0x0000000000000000  0x0000000000000000 
0x00000040009d7770:  0x0000000000000700  0x0000000000000004 
0x00000040009d7780:  0x00000040033b40c0  0x0000000000007701 
0x00000040009d7790:  0x0000000000000000  0x0000000000000000 
0x00000040009d77a0:  0x0000000000000000  0x0000004000600000 
0x00000040009d77b0:  0x0000004002b77c00  0x00000040009d77f8 
0x00000040009d77c0:  0x000000000047703c <github.com/Shopify/sarama.(*RecordBatch).decode+0x000000000000037c>  0x0000004008ec5980 
0x00000040009d77d0:  0x0000000000476f7c <github.com/Shopify/sarama.(*RecordBatch).decode+0x00000000000002bc>  0x00000040099ce05f 
0x00000040009d77e0:  0x000000000010a2ee <type..eq.[...]interface {}+0x000000000000003e>  0x0000000000000000 
0x00000040009d77f0:  0x0000000000000000  0x00000040009d78c8 
0x00000040009d7800:  0x00000000004777dc <github.com/Shopify/sarama.(*Records).decode+0x000000000000009c>  0x00000040033b40c0 
0x00000040009d7810:  0x0000004009ac5ab9  0x0000000000012894 <cmpbody+0x00000000000000e4> 
0x00000040009d7820:  0x0000000000012894 <cmpbody+0x00000000000000e4>  0x00000040009d7898 
0x00000040009d7830:  0x000128c50001ce80  0x0000000000012894 <cmpbody+0x00000000000000e4> 
0x00000040009d7840: >0x0000000000012894 <cmpbody+0x00000000000000e4>  0x00000000000000dc 
0x00000040009d7850:  0x0000000000000000  0x00000040009d7800 
0x00000040009d7860:  0x0000000000476da4 <github.com/Shopify/sarama.(*RecordBatch).decode+0x00000000000000e4>  0x00000040009d78b8 
0x00000040009d7870:  0x0000000000000000  0x0000000000000000 
0x00000040009d7880:  0x0000000000000000  0x0000000000000000 
0x00000040009d7890:  0x0000000000000000  0x0000004009ac5ab9 
0x00000040009d78a0:  0x0000004008ec5980  0x0000000000000000 
0x00000040009d78b0:  0x0000000000000000  0x00000000004772f0 <github.com/Shopify/sarama.(*RecordBatch).decode.func1+0x0000000000000000> 
0x00000040009d78c0:  0x0000004008ec5980  0x00000040009d7918 
0x00000040009d78d0:  0x000000000045a32c <github.com/Shopify/sarama.(*FetchResponseBlock).decode+0x000000000000033c>  0x0000004002cc3540 
0x00000040009d78e0:  0x0000000001c75e50  0x00000040033b40c0 
0x00000040009d78f0:  0x000000000045a31c <github.com/Shopify/sarama.(*FetchResponseBlock).decode+0x000000000000032c>  0x0000000000000018 
0x00000040009d7900:  0x00000000016a8080  0x00000040005cf601 
0x00000040009d7910:  0x00000040027f6300  0x00000040009d79c8 
0x00000040009d7920:  0x000000000045acf8 <github.com/Shopify/sarama.(*FetchResponse).decode+0x00000000000002a8>  0x000000400710dad0 
0x00000040009d7930:  0x0000000001c75e50  0x00000040033b40c0 
fatal error: traceback stuck

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigtramp called from 0xffffaac795c0
stack: frame={sp:0x40006cacd0, fp:0x40006cada0} stack=[0x40006c4000,0x40006cc000)
0x00000040006cabd0:  0x00000040006cac28  0x0000000000000000 
0x00000040006cabe0:  0x0000000000000000  0x00000040006c2000 
0x00000040006cabf0:  0x0000000000000000  0x0000000000000000 
0x00000040006cac00:  0x0000000000000000  0x00000040006cada0 
0x00000040006cac10:  0x00000040006cae20  0x00000040006cac98 
0x00000040006cac20:  0x000000000007d35c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x00000040006cac30:  0x00000040006c0000  0x00000040006cac58 
0x00000040006cac40:  0x0000004007a6bd40  0x0000000000000000 
0x00000040006cac50:  0x0000000000000000  0x0000000000000000 
0x00000040006cac60:  0x0000000000000000  0x0000000000000000 
0x00000040006cac70:  0x0000000000000000  0x0000000000000000 
0x00000040006cac80:  0x0000004007a6bd40  0x00000040006cada0 
0x00000040006cac90:  0x00000040006cae20  0x00000040006cacc8 
0x00000040006caca0:  0x000000000007cc74 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x00000040006cacb0:  0x00000040006cada0  0x00000040006cae20 
0x00000040006cacc0:  0x0000000000000000  0x00000040006cbff0 
0x00000040006cacd0: <0x0000ffffaac795c0  0x000000000000001b 
0x00000040006cace0:  0x00000040006cada0  0x00000040006cae20 
0x00000040006cacf0:  0x000000000004a4fd <runtime.main+0x00000000000002ed>  0x070a091214d6eaae 
0x00000040006cad00:  0x226e776f6e6b6e75  0x3464623239303924 
0x00000040006cad10:  0x342d356131632d61  0x000000000000000d 
0x00000040006cad20:  0x00000000000001dd  0x0000000000000000 
0x00000040006cad30:  0x0000000001c5b300  0x0000004007a6bd40 
0x00000040006cad40:  0x00000040006cacc8  0x3e66376972bea4d0 
0x00000040006cad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x00000040006cad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x00000040006cad70:  0x3fc7642f0ac54c41  0x3ffd137a1ea75678 
0x00000040006cad80:  0xbfca94327912c722  0x0000000000000000 
0x00000040006cad90:  0x0000000000000000  0x0000000000000000 
0x00000040006cada0: >0x000000000000001b  0x00000000fffffffe 
0x00000040006cadb0:  0x0000000000000136  0x0000000000000000 
0x00000040006cadc0:  0x0000000000000000  0x0000000000000000 
0x00000040006cadd0:  0x0000000000000000  0x0000000000000000 
0x00000040006cade0:  0x0000000000000000  0x0000000000000000 
0x00000040006cadf0:  0x0000000000000000  0x0000000000000000 
0x00000040006cae00:  0x0000000000000000  0x0000000000000000 
0x00000040006cae10:  0x0000000000000000  0x0000000000000000 
0x00000040006cae20:  0x0000000000000000  0x0000000000000000 
0x00000040006cae30:  0x00000040006c4000  0x0000000000000000 
0x00000040006cae40:  0x0000000000008000  0x0000000000000000 
0x00000040006cae50:  0x0000000000000000  0x0000000000000000 
0x00000040006cae60:  0x0000000000000000  0x0000000000000000 
0x00000040006cae70:  0x0000000000000000  0x0000000000000000 
0x00000040006cae80:  0x0000000000000000  0x0000000000000000 
0x00000040006cae90:  0x0000000000000000  0x0000000000000000 
runtime.throw({0x1894577?, 0x2a981a0?})
	/usr/local/go/src/runtime/panic.go:992 +0x50
runtime.gentraceback(0x2b3d3a0?, 0x0?, 0x40006ca968?, 0x4007a6bd40, 0x0, 0x40006ca968, 0x40, 0x0, 0x0?, 0x6)
	/usr/local/go/src/runtime/traceback.go:488 +0xd98
runtime.sigprof(0x79410, 0x40006cac08?, 0x0?, 0x4007a6bd40, 0x40006c0000)
	/usr/local/go/src/runtime/proc.go:4507 +0xf0
runtime.sighandler(0x1b?, 0x40006c0000?, 0x40006cac58?, 0x4007a6bd40?)
	/usr/local/go/src/runtime/signal_unix.go:613 +0x514
runtime.sigtrampgo(0x1b, 0x40006cada0, 0x40006cae20)
	/usr/local/go/src/runtime/signal_unix.go:477 +0x164
runtime.sigtrampgo(0x1b, 0x40006cada0, 0x40006cae20)
	<autogenerated>:1 +0x1c
runtime: unexpected return pc for runtime.sigtramp called from 0xffffaac795c0
stack: frame={sp:0x40006cacd0, fp:0x40006cada0} stack=[0x40006c4000,0x40006cc000)
0x00000040006cabd0:  0x00000040006cac28  0x0000000000000000 
0x00000040006cabe0:  0x0000000000000000  0x00000040006c2000 
0x00000040006cabf0:  0x0000000000000000  0x0000000000000000 
0x00000040006cac00:  0x0000000000000000  0x00000040006cada0 
0x00000040006cac10:  0x00000040006cae20  0x00000040006cac98 
0x00000040006cac20:  0x000000000007d35c <runtime.sigtrampgo+0x000000000000001c>  0x000000400000001b 
0x00000040006cac30:  0x00000040006c0000  0x00000040006cac58 
0x00000040006cac40:  0x0000004007a6bd40  0x0000000000000000 
0x00000040006cac50:  0x0000000000000000  0x0000000000000000 
0x00000040006cac60:  0x0000000000000000  0x0000000000000000 
0x00000040006cac70:  0x0000000000000000  0x0000000000000000 
0x00000040006cac80:  0x0000004007a6bd40  0x00000040006cada0 
0x00000040006cac90:  0x00000040006cae20  0x00000040006cacc8 
0x00000040006caca0:  0x000000000007cc74 <runtime.sigtramp+0x0000000000000084>  0x000000000000001b 
0x00000040006cacb0:  0x00000040006cada0  0x00000040006cae20 
0x00000040006cacc0:  0x0000000000000000  0x00000040006cbff0 
0x00000040006cacd0: <0x0000ffffaac795c0  0x000000000000001b 
0x00000040006cace0:  0x00000040006cada0  0x00000040006cae20 
0x00000040006cacf0:  0x000000000004a4fd <runtime.main+0x00000000000002ed>  0x070a091214d6eaae 
0x00000040006cad00:  0x226e776f6e6b6e75  0x3464623239303924 
0x00000040006cad10:  0x342d356131632d61  0x000000000000000d 
0x00000040006cad20:  0x00000000000001dd  0x0000000000000000 
0x00000040006cad30:  0x0000000001c5b300  0x0000004007a6bd40 
0x00000040006cad40:  0x00000040006cacc8  0x3e66376972bea4d0 
0x00000040006cad50:  0xbebbbd41c5d26bf1  0x3f11566aaf25de2c 
0x00000040006cad60:  0xbf66c16c16bebd93  0x3fc5555555555555 
0x00000040006cad70:  0x3fc7642f0ac54c41  0x3ffd137a1ea75678 
0x00000040006cad80:  0xbfca94327912c722  0x0000000000000000 
0x00000040006cad90:  0x0000000000000000  0x0000000000000000 
0x00000040006cada0: >0x000000000000001b  0x00000000fffffffe 
0x00000040006cadb0:  0x0000000000000136  0x0000000000000000 
0x00000040006cadc0:  0x0000000000000000  0x0000000000000000 
0x00000040006cadd0:  0x0000000000000000  0x0000000000000000 
0x00000040006cade0:  0x0000000000000000  0x0000000000000000 
0x00000040006cadf0:  0x0000000000000000  0x0000000000000000 
0x00000040006cae00:  0x0000000000000000  0x0000000000000000 
0x00000040006cae10:  0x0000000000000000  0x0000000000000000 
0x00000040006cae20:  0x0000000000000000  0x0000000000000000 
0x00000040006cae30:  0x00000040006c4000  0x0000000000000000 
0x00000040006cae40:  0x0000000000008000  0x0000000000000000 
0x00000040006cae50:  0x0000000000000000  0x0000000000000000 
0x00000040006cae60:  0x0000000000000000  0x0000000000000000 
0x00000040006cae70:  0x0000000000000000  0x0000000000000000 
0x00000040006cae80:  0x0000000000000000  0x0000000000000000 
0x00000040006cae90:  0x0000000000000000  0x0000000000000000 
runtime.sigtramp()
	/usr/local/go/src/runtime/sys_linux_arm64.s:489 +0x84

goroutine 2446 [running]:

lizthegrey avatar Aug 17 '22 15:08 lizthegrey

There's some slew since 1.18.5:

Oops, apologies, I made this CL based on tip rather than 1.18. Did you get this fixed? If not I can provide another on 1.18.

Fresh crash for you with the additional telemetry.

Did this get built with -tags debuglog? The is missing, though perhaps just not included in your paste. The log will be output after all of the crash stack traces just before exit.

prattmic avatar Aug 17 '22 16:08 prattmic

Did this get built with -tags debuglog? The is missing, though perhaps just not included in your paste. The log will be output after all of the crash stack traces just before exit.

I thought it did, but apparently it's missing them. Does go install accept tags, or only go build?

                    go install -tags debuglog -ldflags "${LD_FLAGS} \
                      -X github.com/honeycombio/hound/env.buildID=$BUILD_ID \
                      -X github.com/honeycombio/hound/env.commitHash=$CIRCLE_SHA1" \
                      ./...

lizthegrey avatar Aug 17 '22 16:08 lizthegrey

Yes, go install accepts tags.

I also see that the stack dump doesn't include the entire stack either (https://go.dev/cl/424195). I'd guess that your build isn't picking up the patches to the runtime at all (e.g., wrong GOROOT), except that it clearly did when you got a build error...

You could double check the symbols in the binary. If debuglog is enabled, it should have various dlogger symbols like runtime.(*dlogger).end, if not those are missing.

prattmic avatar Aug 17 '22 16:08 prattmic

oh, whoops, good catch.

    Roles pinned:
        beagle: 603443
        barbet: 603443

that binary is pinned to an older build. the build instructions were correct, prod just didn't have the correct one. egg on my face, sorry, I'll re-try and let you know when we have a crash with the correct build.

lizthegrey avatar Aug 17 '22 16:08 lizthegrey

Thanks. By the way, do you happen to be using musl libc (/Alpine Linux)? (Is this even a cgo binary?)

I'm investigating #54306 in parallel, presumably an issue with musl. It is probably unrelated, but it would help to know if you are using musl.

prattmic avatar Aug 17 '22 17:08 prattmic

Thanks. By the way, do you happen to be using musl libc (/Alpine Linux)? (Is this even a cgo binary?)

I'm investigating #54306 in parallel, presumably an issue with musl. It is probably unrelated, but it would help to know if you are using musl.

No, we run on Ubuntu 22.04 containers with standard libc 2.35-0ubuntu3.1. No use of cgo

lizthegrey avatar Aug 17 '22 17:08 lizthegrey

https://gist.github.com/lizthegrey/043da22367603cd24acf82c13742d830

lizthegrey avatar Aug 17 '22 17:08 lizthegrey

Thanks, I'll take a look! Does this one have debuglog output at the end? (Lots of lines mostly containing gentrace: ...)

prattmic avatar Aug 17 '22 18:08 prattmic

https://gist.github.com/lizthegrey/f5827b5577abf9b0e3c883eeaf748361

lizthegrey avatar Aug 17 '22 18:08 lizthegrey

Oh, no, there is nothing with "gentrace" at the end.

lizthegrey avatar Aug 17 '22 18:08 lizthegrey

Hrm, that is quite frustrating.

What I can tell from the stack dump is that the SP/FP we've landed at is certainly not a valid frame pointer. It is hard to tell, but the stack doesn't look obviously corrupt to me. Perhaps the stack is OK, but we computed the wrong next frame pointer.

I'll see if I can put more information in another patch, even if debuglog fails.

prattmic avatar Aug 17 '22 20:08 prattmic

New patches, now based on the 1.18 release branch, so they should apply cleanly:

  • https://go.dev/cl/424076: Identical to https://go.dev/cl/424195.
  • https://go.dev/cl/424077: Similar to https://go.dev/cl/424196. Adds the previous frame to the panic output so we get that even if debuglog is still broken. Also enables debuglog by default to try to get that working. As a result, you should not pass -tags debuglog.

prattmic avatar Aug 17 '22 20:08 prattmic

Change https://go.dev/cl/424077 mentions this issue: DO NOT SUBMIT: runtime: add dlog to gentraceback

gopherbot avatar Aug 17 '22 20:08 gopherbot

Change https://go.dev/cl/424076 mentions this issue: DO NOT SUBMIT: runtime: dump entire stack

gopherbot avatar Aug 17 '22 20:08 gopherbot

Go ahead and base on 1.19 going forward, I just upgraded our prod environments...

This patchset worked fine though with offset:

patching file src/runtime/traceback.go
patching file src/runtime/debuglog_off.go
patching file src/runtime/debuglog_on.go
patching file src/runtime/proc.go
Hunk #1 succeeded at 4511 (offset 60 lines).
patching file src/runtime/traceback.go

lizthegrey avatar Aug 18 '22 00:08 lizthegrey

Still not seeing debuglog in output. Are we terminating before debuglog can be printed?

https://gist.github.com/lizthegrey/58174372484922b47c69c147a4c9f1f8

lizthegrey avatar Aug 18 '22 03:08 lizthegrey

It's possible, though I'm surprised there is no further indication of crash if true. Do you see "stack trace unavailable" at the end? Curiously, I've been using the same patch to investigate #54306 and it is working fine.

On another note, I've noticed that most of the stacks so far seem to be in code around github.com/Shopify/sarama.(*RecordBatch).decode (and various other sarama decode functions) which calls into github.com/klauspost/compress/zstd functions. Do you know (from profiles or intuition) if this code path is very hot in this application (so crashing here is likely just coincidence), or fairly rare (so perhaps not a coincidence)? There was one crash above in github.com/honeycombio/libhoney-go, so I lean towards coincidence.

prattmic avatar Aug 18 '22 14:08 prattmic

@lizthegrey @prattmic thanks for the information! The most recent one doesn't have morestack or systemstack, so it may be not limited to that.

I'm still puzzled by the unaligned-PC-like values, e.g. 0x11b2b, which is <runtime/internal/atomic.(*Uint8).And+0x000000000000003b> (it is not only unaligned but also past the end of the function, so the execution should never get there).

It may be helpful to print the PCs it has unwound through, i.e. the content of pcbuf, so we can see how it gets there. (SPs would also be good, but they are not stored anywhere so it's hard to print. PCs plus the content of the stack is probably enough.)

@lizthegrey do you have any run without profiling turned on, and if so, is there any crash? It looks to me that the content of the stack is weird so even a non-profiling run might crash. Thanks.

cherrymui avatar Aug 18 '22 15:08 cherrymui

I'm still puzzled by the unaligned-PC-like values, e.g. 0x11b2b

I was confused by those as well. There are some really odd ones like runtime.debugCallWrap.func1. My guess was that since the values are fairly small, they are probably just random arguments to these functions and not PCs at all.

prattmic avatar Aug 18 '22 15:08 prattmic

@lizthegrey do you have any run without profiling turned on, and if so, is there any crash? It looks to me that the content of the stack is weird so even a non-profiling run might crash. Thanks.

No profiling = no crash. We've had copies of some of these binaries running without profiling for months, that have not crashed, when we enabled profiling they immediately started crashing about once per day out of 140 processes.

lizthegrey avatar Aug 18 '22 15:08 lizthegrey

It's possible, though I'm surprised there is no further indication of crash if true. Do you see "stack trace unavailable" at the end? Curiously, I've been using the same patch to investigate #54306 and it is working fine.

Oddly, no. What's also odd is that the kubectl log output stops after a small number of goroutines, rather than printing all the way through. I'll see if our fluentbit exporter got the entire thing and report back.

On another note, I've noticed that most of the stacks so far seem to be in code around github.com/Shopify/sarama.(*RecordBatch).decode (and various other sarama decode functions) which calls into github.com/klauspost/compress/zstd functions. Do you know (from profiles or intuition) if this code path is very hot in this application (so crashing here is likely just coincidence), or fairly rare (so perhaps not a coincidence)? There was one crash above in github.com/honeycombio/libhoney-go, so I lean towards coincidence.

It's a coincidence, sarama/zstd is about 30-40% of the workload of this app, and libhoney-go is 10%.

lizthegrey avatar Aug 18 '22 15:08 lizthegrey

What's also odd is that the kubectl log output stops after a small number of goroutines

Huh. I'd noticed that your logs stop at the second goroutine header and assumed you were just manually truncating. Is that actually where the output stops?

I've updated https://go.dev/cl/424196 to:

  • Print debuglog before the panic tracebacks. Maybe this will get it out?
  • Remember and dump the first 32 frames we walk (which is primarily what I want out of debuglog).
  • Dump the contents of pcbuf.

Could you give it a try?

prattmic avatar Aug 18 '22 20:08 prattmic

Huh. I'd noticed that your logs stop at the second goroutine header and assumed you were just manually truncating. Is that actually where the output stops?

It is indeed. on a recent crash, even pulling the logs from fluentbit, they're truncated directly after the second goroutine:

	/usr/local/go/src/runtime/sys_linux_arm64.s:471 +0x58 fp=0x400081cda0 sp=0x400081cce0 pc=0x80258

goroutine 469267 [running]:

and then it stops.

I've updated go.dev/cl/424196 to:

  • Print debuglog before the panic tracebacks. Maybe this will get it out?
  • Remember and dump the first 32 frames we walk (which is primarily what I want out of debuglog).
  • Dump the contents of pcbuf.

Could you give it a try?

Perfect, will give it a try.

lizthegrey avatar Aug 18 '22 20:08 lizthegrey

Finally got a good dump including debug logs for you. https://gist.github.com/lizthegrey/0ed6b93167f618fa0d09becf975888c4

lizthegrey avatar Aug 19 '22 03:08 lizthegrey

Awesome, thanks!

Here's what I see:

Caveat: The debuglog is still a bit corrupted, I guess because I printed early while other threads were still writing to it. That's OK though, the frame history has most of what I want.

This trace goes through libhoney.(*fieldHolder).AddField. See the source here.

previous frame 0 {pc:0x7cdc0, lr:0x24cb8, sp:0x406e1b24c0, fp:0x406e1b24c0}
previous frame 1 {pc:0x24cb8, lr:0x241a4, sp:0x406e1b24c0, fp:0x406e1b24c0}

I'm not certain what these functions are, as they don't appear in the stack dump.

Based on my own build of 1.19, mapassign_faststr+0x114 is the return site from runtime.growWork_faststr. The offset from there is 0x24cb8 yields the return site from runtime.morestack_noctxt.abi0 in runtime.evacuate_faststr. So 0x7cdc0 is probably in morestack?

previous frame 2 {pc:0x241a4, lr:0x241a4, sp:0x406e1b24c0, fp:0x406e1b2530}
previous frame 3 {pc:0x241a4, lr:0x59244c, sp:0x406e1b2530, fp:0x406e1b25a0}

0x241a4 is <runtime.mapassign_faststr+0x0000000000000114>. At the moment, I am not sure why there is an extra frame with pc == lr. It might be because 0x241a4 is actually an instruction from runtime.tophash inlined into runtime.mapassign_faststr.

previous frame 4 {pc:0x59244c, lr:0x592520, sp:0x406e1b25a0, fp:0x406e1b2600}

0x59244c is <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField+0x000000000000004c>. Based on my build of libhoney, AddField+0x4c is the return site from sync.(*RWMutex).Lock, not runtime.mapassign_faststr (there is also a call to that later in the function at +0x90). This seems wrong...

previous frame 5 {pc:0x592520, lr:0x5932b0, sp:0x406e1b2600, fp:0x406e1b2600}

0x0592520 is <github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1+0x0000000000000000>. AddField.func1 is the wrapper for defer f.lock.Unlock(). This doesn't make sense, because the wrapper does not call the main function, plus AddField.func1+0x0 is the beginning of the function and definitely not a CALL.

Presumably we get stuck here because the SP delta is 0, as SP hasn't been adjusted at function entry.

(I will note that the instruction just before this is JMP github.com/honeycombio/libhoney-go.(*fieldHolder).AddField(SB) at the end of AddField, for restarting after morestack. Still not a CALL, but a typical pc-4 would put us in a different function).

I see lots of oddities here, though I'm not sure yet exactly where the problem is.

@lizthegrey could you help double-check if my compilation of libhoney matches yours by posting the disassembly of github.com/honeycombio/libhoney-go.(*fieldHolder).AddField and github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1 from the actual binary that crashed? (go tool objdump <binary>, look for TEXT github.com/honeycombio/libhoney-go.(*fieldHolder).AddField).

Also please send over additional crashes if you get them.

prattmic avatar Aug 19 '22 15:08 prattmic

Thanks!

previous frame 0 {pc:0x7cdc0, lr:0x24cb8, sp:0x406e1b24c0, fp:0x406e1b24c0}
previous frame 1 {pc:0x24cb8, lr:0x241a4, sp:0x406e1b24c0, fp:0x406e1b24c0}
previous frame 2 {pc:0x241a4, lr:0x241a4, sp:0x406e1b24c0, fp:0x406e1b2530}

This is weird. It shouldn't be possible to have three frames to have the same SP. And frame 2's PC = LR, which also makes no sense. Do you know what functions are for PC 0x7cdc0 and 0x24cb8? Thanks.

It is also interesting that debuglog doesn't seem to include these frames...

cherrymui avatar Aug 19 '22 15:08 cherrymui

Additional crashes all appended to the same secret gist (see traceback{2,3,4,...})

code for Sarama and Libhoney should be available in OSS -- see https://github.com/honeycombio/libhoney-go and https://github.com/shopify/sarama

I'll post the disassembly as seen in our compiled code.

lizthegrey avatar Aug 19 '22 15:08 lizthegrey

in the binary (shepherd) that generated tracebacks 1-3:

TEXT github.com/honeycombio/libhoney-go.(*Client).AddField(SB) /mnt/ramdisk/go/pkg/mod/github.com/honeycombio/[email protected]/client.go
  client.go:193         0x5913e0                f9400b90                MOVD 16(R28), R16                                                               
  client.go:193         0x5913e4                eb3063ff                CMP R16, RSP                                                                    
  client.go:193         0x5913e8                54000309                BLS 24(PC)                                                                      
  client.go:193         0x5913ec                f81c0ffe                MOVD.W R30, -64(RSP)                                                            
  client.go:193         0x5913f0                f81f83fd                MOVD R29, -8(RSP)                                                               
  client.go:193         0x5913f4                d10023fd                SUB $8, RSP, R29                                                                
  client.go:193         0x5913f8                f9002be1                MOVD R1, 80(RSP)                                                                
  client.go:193         0x5913fc                f90027e0                MOVD R0, 72(RSP)                                                                
  client.go:193         0x591400                f9002fe2                MOVD R2, 88(RSP)                                                                
  client.go:193         0x591404                f90033e3                MOVD R3, 96(RSP)                                                                
  client.go:193         0x591408                f90037e4                MOVD R4, 104(RSP)                                                               
  client.go:194         0x59140c                97fffe7d                CALL github.com/honeycombio/libhoney-go.(*Client).ensureTransmission(SB)        
  client.go:195         0x591410                f94027e0                MOVD 72(RSP), R0                                                                
  client.go:195         0x591414                97fffefb                CALL github.com/honeycombio/libhoney-go.(*Client).ensureBuilder(SB)             
  client.go:196         0x591418                f94027e1                MOVD 72(RSP), R1                                                                
  client.go:196         0x59141c                f9401021                MOVD 32(R1), R1                                                                 
  client.go:196         0x591420                3980003b                MOVB (R1), R27                                                                  
  client.go:196         0x591424                9100e020                ADD $56, R1, R0                                                                 
  client.go:196         0x591428                f9402be1                MOVD 80(RSP), R1                                                                
  client.go:196         0x59142c                f9402fe2                MOVD 88(RSP), R2                                                                
  client.go:196         0x591430                f94033e3                MOVD 96(RSP), R3                                                                
  client.go:196         0x591434                f94037e4                MOVD 104(RSP), R4                                                               
  client.go:196         0x591438                940003f2                CALL github.com/honeycombio/libhoney-go.(*fieldHolder).AddField(SB)             
  client.go:197         0x59143c                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  client.go:197         0x591440                910103ff                ADD $64, RSP, RSP                                                               
  client.go:197         0x591444                d65f03c0                RET                                                                             
  client.go:193         0x591448                f90007e0                MOVD R0, 8(RSP)                                                                 
  client.go:193         0x59144c                f9000be1                MOVD R1, 16(RSP)                                                                
  client.go:193         0x591450                f9000fe2                MOVD R2, 24(RSP)                                                                
  client.go:193         0x591454                f90013e3                MOVD R3, 32(RSP)                                                                
  client.go:193         0x591458                f90017e4                MOVD R4, 40(RSP)                                                                
  client.go:193         0x59145c                aa1e03e3                MOVD R30, R3                                                                    
  client.go:193         0x591460                97ebae58                CALL runtime.morestack_noctxt.abi0(SB)                                          
  client.go:193         0x591464                f94007e0                MOVD 8(RSP), R0                                                                 
  client.go:193         0x591468                f9400be1                MOVD 16(RSP), R1                                                                
  client.go:193         0x59146c                f9400fe2                MOVD 24(RSP), R2                                                                
  client.go:193         0x591470                f94013e3                MOVD 32(RSP), R3                                                                
  client.go:193         0x591474                f94017e4                MOVD 40(RSP), R4                                                                
  client.go:193         0x591478                17ffffda                JMP github.com/honeycombio/libhoney-go.(*Client).AddField(SB)                   
  client.go:193         0x59147c                00000000                ?                                                                               
TEXT github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1(SB) /mnt/ramdisk/go/pkg/mod/github.com/honeycombio/[email protected]/libhoney.go
  libhoney.go:567       0x592520                f9400b90                MOVD 16(R28), R16                                                               
  libhoney.go:567       0x592524                eb3063ff                CMP R16, RSP                                                                    
  libhoney.go:567       0x592528                54000169                BLS 11(PC)                                                                      
  libhoney.go:567       0x59252c                f81e0ffe                MOVD.W R30, -32(RSP)                                                            
  libhoney.go:567       0x592530                f81f83fd                MOVD R29, -8(RSP)                                                               
  libhoney.go:567       0x592534                d10023fd                SUB $8, RSP, R29                                                                
  libhoney.go:567       0x592538                f9401390                MOVD 32(R28), R16                                                               
  libhoney.go:567       0x59253c                b5000130                CBNZ R16, 9(PC)                                                                 
  libhoney.go:567       0x592540                f9400740                MOVD 8(R26), R0                                                                 
  libhoney.go:567       0x592544                97ebdf5b                CALL sync.(*RWMutex).Unlock(SB)                                                 
  libhoney.go:567       0x592548                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  libhoney.go:567       0x59254c                910083ff                ADD $32, RSP, RSP                                                               
  libhoney.go:567       0x592550                d65f03c0                RET                                                                             
  libhoney.go:567       0x592554                aa1e03e3                MOVD R30, R3                                                                    
  libhoney.go:567       0x592558                97eba9fa                CALL runtime.morestack.abi0(SB)                                                 
  libhoney.go:567       0x59255c                17fffff1                JMP github.com/honeycombio/libhoney-go.(*fieldHolder).AddField.func1(SB)        
  libhoney.go:567       0x592560                f9400211                MOVD (R16), R17                                                                 
  libhoney.go:567       0x592564                9100a3f4                ADD $40, RSP, R20                                                               
  libhoney.go:567       0x592568                eb11029f                CMP R17, R20                                                                    
  libhoney.go:567       0x59256c                54fffea1                BNE -11(PC)                                                                     
  libhoney.go:567       0x592570                910023f4                ADD $8, RSP, R20                                                                
  libhoney.go:567       0x592574                f9000214                MOVD R20, (R16)                                                                 
  libhoney.go:567       0x592578                17fffff2                JMP -14(PC)                                                                     
  libhoney.go:567       0x59257c                00000000                ?                                                                               

traceback 4 is from a different binary, let me know what disasm you need from that.

lizthegrey avatar Aug 19 '22 15:08 lizthegrey

0x7cdc0

TEXT github.com/gogo/protobuf/proto.makeTimePtrSliceMarshaler.func2(SB) /mnt/ramdisk/go/pkg/mod/github.com/gogo/[email protected]/proto/table_marshal_gogo.go
[...]
  value.go:1616                 0x7cdbf8                f1005cff                CMP $23, R7                                                             
  value.go:1616                 0x7cdbfc                54000061                BNE 3(PC)                                                               
  value.go:1617                 0x7cdc00                f9400427                MOVD 8(R1), R7                                                          
  table_marshal_gogo.go:257     0x7cdc04                1400000a                JMP 10(PC)                                                              
  value.go:1619                 0x7cdc08                97e41bae                CALL reflect.Value.lenNonSlice(SB)                                      
  table_marshal_gogo.go:258     0x7cdc0c                f9404be1                MOVD 144(RSP), R1        

0x24cb8

TEXT hash/crc32.MakeTable(SB) /usr/local/go/src/hash/crc32/crc32.go
[...]
  crc32_generic.go:29   0x24cb80                14000005                JMP 5(PC)                               
  crc32.go:137          0x24cb84                a97ffbfd                LDP -8(RSP), (R29, R30)                 
  crc32.go:137          0x24cb88                9100c3ff                ADD $48, RSP, RSP                       
  crc32.go:137          0x24cb8c                d65f03c0                RET                   

lizthegrey avatar Aug 19 '22 15:08 lizthegrey

libhoney-go

I've identified why your build output for libhoney-go may be different from mine. this repo is lagging the mainline a bit.

go 1.18

require (
[...]
        github.com/honeycombio/libhoney-go v1.15.8
        github.com/Shopify/sarama v1.31.0

lizthegrey avatar Aug 19 '22 15:08 lizthegrey

TEXT github.com/honeycombio/libhoney-go.(*Client).AddField(SB) 

This is Client.AddField. Could you post fieldHolder.AddField?

Also for 0x7cdc0, it looks like you posted 0x7cdc00 instead (off by an order of magnitude). Same with 0x24cb8.

prattmic avatar Aug 19 '22 16:08 prattmic

So 0x7cdc0 is probably in morestack

If this is in morestack, I think that makes sense. morestack itself doesn't have a frame, so that's "frame 0" above. Its caller (evacuate_faststr) hasn't created a frame yet, so it also doesn't have a frame, which is "frame 1" above.

Frame 1 should have its return address in growWork_faststr, but somehow that is missing, and directly skip to mapassign_faststr?

My guess is that a profiling signal lands in morestack but before the stack switch. The jumpstack logic doesn't kick in as it hasn't switched. At this point PC=morestack, LR=caller of morestack, i.e. evacuate_faststr. And it unwinds a frame normally.

At the second frame, nothing sets the LR specially, so at this point https://cs.opensource.google/go/go/+/master:src/runtime/traceback.go;l=228 it just loads the LR from the stack. But evacuate_faststr hasn't actually saved LR on stack, so it loads the LR of its caller, which is the LR of growWork_faststr, which is mapassgn_faststr`. I think that explains the missing frame. I think the logic of the code assumes only the last frame can be frameless (the n==0 check), all following frames have LR saved on stack, so it assumes it is safe to load. But the morestack-but-havevn't-switched case breaks that logic...

cherrymui avatar Aug 19 '22 16:08 cherrymui

morestack writes SP, so it should have SPWRITE bit set. If the jump stack logic doesn't kick in, it should stop here https://cs.opensource.google/go/go/+/master:src/runtime/traceback.go;l=201 . Unfortunate for profiling, but doesn't crash.

However, morestack_noctxt doesn't write SP, so if a signal lands there, it tries to unwind normally... Maybe a stop-gap fix is to mark morestack_noctxt SPWRITE...

cherrymui avatar Aug 19 '22 16:08 cherrymui

TEXT github.com/honeycombio/libhoney-go.(*Client).AddField(SB) 

This is Client.AddField. Could you post fieldHolder.AddField?

Also for 0x7cdc0, it looks like you posted 0x7cdc00 instead (off by an order of magnitude). Same with 0x24cb8.

Oh, whoops, sorry. here you go. (shepherd binary, tracebacks 1-3)

TEXT github.com/honeycombio/libhoney-go.(*fieldHolder).AddField(SB) /mnt/ramdisk/go/pkg/mod/github.com/honeycombio/[email protected]/libhoney.go
  libhoney.go:565       0x592400                f9400b90                MOVD 16(R28), R16                                                       
  libhoney.go:565       0x592404                eb3063ff                CMP R16, RSP                                                            
  libhoney.go:565       0x592408                54000729                BLS 57(PC)                                                              
  libhoney.go:565       0x59240c                f81a0ffe                MOVD.W R30, -96(RSP)                                                    
  libhoney.go:565       0x592410                f81f83fd                MOVD R29, -8(RSP)                                                       
  libhoney.go:565       0x592414                d10023fd                SUB $8, RSP, R29                                                        
  libhoney.go:565       0x592418                f90043e3                MOVD R3, 128(RSP)                                                       
  libhoney.go:565       0x59241c                f9002bff                MOVD ZR, 80(RSP)                                                        
  libhoney.go:565       0x592420                f90037e0                MOVD R0, 104(RSP)                                                       
  libhoney.go:565       0x592424                f9003fe2                MOVD R2, 120(RSP)                                                       
  libhoney.go:565       0x592428                f9003be1                MOVD R1, 112(RSP)                                                       
  libhoney.go:568       0x59242c                f9001fe3                MOVD R3, 56(RSP)                                                        
  libhoney.go:568       0x592430                f90047e4                MOVD R4, 136(RSP)                                                       
  libhoney.go:565       0x592434                3900bfff                MOVB ZR, 47(RSP)                                                        
  libhoney.go:566       0x592438                3980001b                MOVB (R0), R27                                                          
  libhoney.go:566       0x59243c                91002005                ADD $8, R0, R5                                                          
  libhoney.go:566       0x592440                f9001be5                MOVD R5, 48(RSP)                                                        
  libhoney.go:566       0x592444                aa0503e0                MOVD R5, R0                                                             
  libhoney.go:566       0x592448                97ebdef6                CALL sync.(*RWMutex).Lock(SB)                                           
  libhoney.go:567       0x59244c                a9047fff                STP (ZR, ZR), 64(RSP)                                                   
  libhoney.go:567       0x592450                90000001                ADRP 0(PC), R1                                                          
  libhoney.go:567       0x592454                91148021                ADD $1312, R1, R1                                                       
  libhoney.go:567       0x592458                f90023e1                MOVD R1, 64(RSP)                                                        
  libhoney.go:567       0x59245c                f9401be1                MOVD 48(RSP), R1                                                        
  libhoney.go:567       0x592460                f90027e1                MOVD R1, 72(RSP)                                                        
  libhoney.go:567       0x592464                910103e1                ADD $64, RSP, R1                                                        
  libhoney.go:567       0x592468                f9002be1                MOVD R1, 80(RSP)                                                        
  libhoney.go:567       0x59246c                b24003e1                ORR $1, ZR, R1                                                          
  libhoney.go:567       0x592470                3900bfe1                MOVB R1, 47(RSP)                                                        
  libhoney.go:568       0x592474                f94037e1                MOVD 104(RSP), R1                                                       
  libhoney.go:568       0x592478                f9400021                MOVD (R1), R1                                                           
  libhoney.go:568       0x59247c                90008e80                ADRP 18677760(PC), R0                                                   
  libhoney.go:568       0x592480                91210000                ADD $2112, R0, R0                                                       
  libhoney.go:568       0x592484                f9403be2                MOVD 112(RSP), R2                                                       
  libhoney.go:568       0x592488                f9403fe3                MOVD 120(RSP), R3                                                       
  libhoney.go:568       0x59248c                97ea4701                CALL runtime.mapassign_faststr(SB)                                      
  libhoney.go:568       0x592490                f9401fe1                MOVD 56(RSP), R1                                                        
  libhoney.go:568       0x592494                f9000001                MOVD R1, (R0)                                                           
  libhoney.go:568       0x592498                f001429b                ADRP 42283008(PC), R27                                                  
  libhoney.go:568       0x59249c                9124837b                ADD $2336, R27, R27                                                     
  libhoney.go:568       0x5924a0                b9400361                MOVWU (R27), R1                                                         
  libhoney.go:568       0x5924a4                35000081                CBNZW R1, 4(PC)                                                         
  libhoney.go:568       0x5924a8                f94047e1                MOVD 136(RSP), R1                                                       
  libhoney.go:568       0x5924ac                f9000401                MOVD R1, 8(R0)                                                          
  libhoney.go:568       0x5924b0                14000004                JMP 4(PC)                                                               
  libhoney.go:568       0x5924b4                91002002                ADD $8, R0, R2                                                          
  libhoney.go:568       0x5924b8                f94047e3                MOVD 136(RSP), R3                                                       
  libhoney.go:568       0x5924bc                97ebb341                CALL runtime.gcWriteBarrier(SB)                                         
  libhoney.go:569       0x5924c0                3900bfff                MOVB ZR, 47(RSP)                                                        
  libhoney.go:569       0x5924c4                f9402bfa                MOVD 80(RSP), R26                                                       
  libhoney.go:569       0x5924c8                f9400340                MOVD (R26), R0                                                          
  libhoney.go:569       0x5924cc                d63f0000                CALL (R0)                                                               
  libhoney.go:569       0x5924d0                a97ffbfd                LDP -8(RSP), (R29, R30)                                                 
  libhoney.go:569       0x5924d4                910183ff                ADD $96, RSP, RSP                                                       
  libhoney.go:569       0x5924d8                d65f03c0                RET                                                                     
  libhoney.go:569       0x5924dc                97eadced                CALL runtime.deferreturn(SB)                                            
  libhoney.go:569       0x5924e0                a97ffbfd                LDP -8(RSP), (R29, R30)                                                 
  libhoney.go:569       0x5924e4                910183ff                ADD $96, RSP, RSP                                                       
  libhoney.go:569       0x5924e8                d65f03c0                RET                                                                     
  libhoney.go:565       0x5924ec                f90007e0                MOVD R0, 8(RSP)                                                         
  libhoney.go:565       0x5924f0                f9000be1                MOVD R1, 16(RSP)                                                        
  libhoney.go:565       0x5924f4                f9000fe2                MOVD R2, 24(RSP)                                                        
  libhoney.go:565       0x5924f8                f90013e3                MOVD R3, 32(RSP)                                                        
  libhoney.go:565       0x5924fc                f90017e4                MOVD R4, 40(RSP)                                                        
  libhoney.go:565       0x592500                aa1e03e3                MOVD R30, R3                                                            
  libhoney.go:565       0x592504                97ebaa2f                CALL runtime.morestack_noctxt.abi0(SB)                                  
  libhoney.go:565       0x592508                f94007e0                MOVD 8(RSP), R0                                                         
  libhoney.go:565       0x59250c                f9400be1                MOVD 16(RSP), R1                                                        
  libhoney.go:565       0x592510                f9400fe2                MOVD 24(RSP), R2                                                        
  libhoney.go:565       0x592514                f94013e3                MOVD 32(RSP), R3                                                        
  libhoney.go:565       0x592518                f94017e4                MOVD 40(RSP), R4                                                        
  libhoney.go:565       0x59251c                17ffffb9                JMP github.com/honeycombio/libhoney-go.(*fieldHolder).AddField(SB)      
TEXT runtime.morestack_noctxt.abi0(SB) /usr/local/go/src/runtime/asm_arm64.s
  asm_arm64.s:323       0x7cdc0                 2a1f03fa                MOVW ZR, R26                    
  asm_arm64.s:324       0x7cdc4                 17ffffdf                JMP runtime.morestack.abi0(SB)  
  asm_arm64.s:324       0x7cdc8                 00000000                ?                               
  asm_arm64.s:324       0x7cdcc                 00000000                ?                               
TEXT runtime.evacuate_faststr(SB) /usr/local/go/src/runtime/map_faststr.go
  map_faststr.go:402    0x24900                 f9400b90                MOVD 16(R28), R16                       
  map_faststr.go:402    0x24904                 d100c3f1                SUB $48, RSP, R17                       
  map_faststr.go:402    0x24908                 eb10023f                CMP R16, R17                            
  map_faststr.go:402    0x2490c                 54001cc9                BLS 230(PC)                             
  map_faststr.go:402    0x24910                 f8150ffe                MOVD.W R30, -176(RSP)                   
  map_faststr.go:402    0x24914                 f81f83fd                MOVD R29, -8(RSP)                       
  map_faststr.go:402    0x24918                 d10023fd                SUB $8, RSP, R29                        
  map_faststr.go:403    0x2491c                 39402424                MOVBU 9(R1), R4                         
  map.go:1118           0x24920                 d1000485                SUB $1, R4, R5                          
  map.go:1111           0x24924                 39402026                MOVBU 8(R1), R6                         
  map.go:1111           0x24928                 927d00c6                AND $8, R6, R6                          
  map.go:1111           0x2492c                 6b1f00df                CMPW ZR, R6                             
  map.go:1120           0x24930                 9a851084                CSEL NE, R4, R5, R4                     
  map_faststr.go:403    0x24934                 f9400c25                MOVD 24(R1), R5                         
  map_faststr.go:403    0x24938                 7940a406                MOVHU 82(R0), R6                        
  map_faststr.go:404    0x2493c                 d503201f                NOOP                                    
  map_faststr.go:405    0x24940                 d503201f                NOOP                                    
  stubs.go:18           0x24944                 9b061445                MADD R6, R5, R2, R5                     
  map.go:186            0x24948                 92401484                AND $63, R4, R4                         
  map.go:186            0x2494c                 b24003e6                ORR $1, ZR, R6                          
  map.go:186            0x24950                 9ac420c4                LSL R4, R6, R4                          
  map.go:204            0x24954                 394000a6                MOVBU (R5), R6                          
  map.go:205            0x24958                 d10008c6                SUB $2, R6, R6                          
  map.go:205            0x2495c                 d3401cc6                UBFX $0, R6, $8, R6                     
  map.go:205            0x24960                 71000cdf                CMPW $3, R6                             
  map.go:1117           0x24964                 540003e3                BCC 31(PC)                              
  map_faststr.go:410    0x24968                 a906ffff                STP (ZR, ZR), 104(RSP)                  
  map_faststr.go:410    0x2496c                 a907ffff                STP (ZR, ZR), 120(RSP)                  
  map_faststr.go:410    0x24970                 a908ffff                STP (ZR, ZR), 136(RSP)                  
  map_faststr.go:410    0x24974                 a909ffff                STP (ZR, ZR), 152(RSP)                  
  map_faststr.go:412    0x24978                 f9400826                MOVD 16(R1), R6                         
  map_faststr.go:412    0x2497c                 7940a407                MOVHU 82(R0), R7                        
  stubs.go:18           0x24980                 9b071846                MADD R7, R6, R2, R6                     
  map_faststr.go:412    0x24984                 f90037e6                MOVD R6, 104(RSP)                       
  map_faststr.go:413    0x24988                 f94037e6                MOVD 104(RSP), R6                       
  stubs.go:18           0x2498c                 910020c7                ADD $8, R6, R7                          
  map_faststr.go:413    0x24990                 f9003fe7                MOVD R7, 120(RSP)                       
  stubs.go:18           0x24994                 910220c6                ADD $136, R6, R6                        
  map_faststr.go:414    0x24998                 f90043e6                MOVD R6, 128(RSP)                       
  map.go:1111           0x2499c                 39402026                MOVBU 8(R1), R6                         
  map_faststr.go:416    0x249a0                 37180166                TBNZ $3, R6, 11(PC)                     
  map_faststr.go:420    0x249a4                 f9400826                MOVD 16(R1), R6                         
  map_faststr.go:420    0x249a8                 8b020087                ADD R2, R4, R7                          
  map_faststr.go:420    0x249ac                 7940a408                MOVHU 82(R0), R8                        
  stubs.go:18           0x249b0                 9b071906                MADD R7, R6, R8, R6                     
  map_faststr.go:420    0x249b4                 f90047e6                MOVD R6, 136(RSP)                       
  map_faststr.go:421    0x249b8                 f94047e6                MOVD 136(RSP), R6                       
  stubs.go:18           0x249bc                 910020c7                ADD $8, R6, R7                          
  map_faststr.go:421    0x249c0                 f9004fe7                MOVD R7, 152(RSP)                       
  stubs.go:18           0x249c4                 910220c6                ADD $136, R6, R6                        
  map_faststr.go:422    0x249c8                 f90053e6                MOVD R6, 160(RSP)                       
  map_faststr.go:402    0x249cc                 f90067e2                MOVD R2, 200(RSP)                       
  map.go:186            0x249d0                 f9001fe4                MOVD R4, 56(RSP)                        
  map_faststr.go:402    0x249d4                 f9005fe0                MOVD R0, 184(RSP)                       
  map_faststr.go:402    0x249d8                 f90063e1                MOVD R1, 192(RSP)                       
  map_faststr.go:402    0x249dc                 14000010                JMP 16(PC)                              
  map_faststr.go:482    0x249e0                 f9401023                MOVD 32(R1), R3                         
  map_faststr.go:482    0x249e4                 eb03005f                CMP R3, R2                              
  map_faststr.go:482    0x249e8                 540000c1                BNE 6(PC)                               
  map_faststr.go:483    0x249ec                 aa0403e2                MOVD R4, R2                             
  map_faststr.go:402    0x249f0                 aa0003e3                MOVD R0, R3                             
  map_faststr.go:483    0x249f4                 aa0103e0                MOVD R1, R0                             
  map_faststr.go:483    0x249f8                 aa0303e1                MOVD R3, R1                             
  map_faststr.go:483    0x249fc                 97fff385                CALL runtime.advanceEvacuationMark(SB)  
  map_faststr.go:485    0x24a00                 a97ffbfd                LDP -8(RSP), (R29, R30)                 
  map_faststr.go:485    0x24a04                 9102c3ff                ADD $176, RSP, RSP                      
  map_faststr.go:485    0x24a08                 d65f03c0                RET                                     
  map.go:209            0x24a0c                 7940a406                MOVHU 82(R0), R6                        
  stubs.go:18           0x24a10                 8b0600a6                ADD R6, R5, R6                          
  stubs.go:18           0x24a14                 d10020c6                SUB $8, R6, R6                          
  map.go:209            0x24a18                 f94000c5                MOVD (R6), R5                           
  map_faststr.go:425    0x24a1c                 b4000105                CBZ R5, 8(PC)                           
  map_faststr.go:425    0x24a20                 f90033e5                MOVD R5, 96(RSP)                        
  map_faststr.go:426    0x24a24                 d503201f                NOOP                                    
  map_faststr.go:427    0x24a28                 d503201f                NOOP                                    
  stubs.go:18           0x24a2c                 910020a6                ADD $8, R5, R6                          
  stubs.go:18           0x24a30                 910220a7                ADD $136, R5, R7                        
  stubs.go:18           0x24a34                 d2800003                MOVD $0, R3                             
  map_faststr.go:428    0x24a38                 14000018                JMP 24(PC)                              
  map_faststr.go:472    0x24a3c                 39402023                MOVBU 8(R1), R3                         
  map_faststr.go:472    0x24a40                 370ffd03                TBNZ $1, R3, -24(PC)                    
  map_faststr.go:472    0x24a44                 f9402003                MOVD 64(R0), R3                         
  map_faststr.go:472    0x24a48                 f9400463                MOVD 8(R3), R3                          
  map_faststr.go:472    0x24a4c                 b4fffca3                CBZ R3, -27(PC)                         
  map_faststr.go:473    0x24a50                 f9400c23                MOVD 24(R1), R3                         
  map_faststr.go:473    0x24a54                 7940a404                MOVHU 82(R0), R4                        
  map_faststr.go:476    0x24a58                 d503201f                NOOP                                    
  map_faststr.go:477    0x24a5c                 d1002085                SUB $8, R4, R5                          
  stubs.go:18           0x24a60                 9b020c83                MADD R2, R3, R4, R3                     
  stubs.go:18           0x24a64                 91002063                ADD $8, R3, R3                          
  map_faststr.go:478    0x24a68                 aa0303e0                MOVD R3, R0                             
  map_faststr.go:478    0x24a6c                 aa0503e1                MOVD R5, R1                             
  map_faststr.go:478    0x24a70                 9400015c                CALL runtime.memclrHasPointers(SB)      
  map_faststr.go:483    0x24a74                 f9405fe0                MOVD 184(RSP), R0                       
  map_faststr.go:482    0x24a78                 f94063e1                MOVD 192(RSP), R1                       
  map_faststr.go:482    0x24a7c                 f94067e2                MOVD 200(RSP), R2                       
  map_faststr.go:483    0x24a80                 f9401fe4                MOVD 56(RSP), R4                        
  map_faststr.go:478    0x24a84                 17ffffd7                JMP -41(PC)                             
  map_faststr.go:428    0x24a88                 91000463                ADD $1, R3, R3                          
  map_faststr.go:428    0x24a8c                 39414409                MOVBU 81(R0), R9                        
  stubs.go:18           0x24a90                 910040c6                ADD $16, R6, R6                         
  stubs.go:18           0x24a94                 8b0900e7                ADD R9, R7, R7                          
  map_faststr.go:428    0x24a98                 f100207f                CMP $8, R3                              
  map_faststr.go:428    0x24a9c                 54fffb8a                BGE -36(PC)                             
  map_faststr.go:429    0x24aa0                 386368a8                MOVBU (R5)(R3), R8                      
  map.go:112            0x24aa4                 aa0803e9                MOVD R8, R9                             
  map.go:112            0x24aa8                 7100053f                CMPW $1, R9                             
  map_faststr.go:430    0x24aac                 54000088                BHI 4(PC)                               
  map_faststr.go:431    0x24ab0                 b27e03e8                ORR $4, ZR, R8                          
  map_faststr.go:431    0x24ab4                 382368a8                MOVB R8, (R5)(R3)                       
  map_faststr.go:432    0x24ab8                 17fffff4                JMP -12(PC)                             
  map_faststr.go:434    0x24abc                 7100153f                CMPW $5, R9                             
  map_faststr.go:434    0x24ac0                 54000e83                BCC 116(PC)                             
  map_faststr.go:428    0x24ac4                 f9001be3                MOVD R3, 48(RSP)                        
  map_faststr.go:429    0x24ac8                 3900bfe8                MOVB R8, 47(RSP)                        
  map_faststr.go:428    0x24acc                 f9002be7                MOVD R7, 80(RSP)                        
  map_faststr.go:459    0x24ad0                 f90027e6                MOVD R6, 72(RSP)                        
  map.go:1111           0x24ad4                 39402029                MOVBU 8(R1), R9                         
  map_faststr.go:438    0x24ad8                 36180069                TBZ $3, R9, 3(PC)                       
  map_faststr.go:438    0x24adc                 d2800009                MOVD $0, R9                             
  map_faststr.go:438    0x24ae0                 14000015                JMP 21(PC)                              
  map_faststr.go:441    0x24ae4                 f940241a                MOVD 72(R0), R26                        
  map_faststr.go:441    0x24ae8                 f9400342                MOVD (R26), R2                          
  map_faststr.go:441    0x24aec                 b9400c23                MOVWU 12(R1), R3                        
  map_faststr.go:441    0x24af0                 aa0603e0                MOVD R6, R0                             
  map_faststr.go:441    0x24af4                 aa0303e1                MOVD R3, R1                             
  map_faststr.go:441    0x24af8                 d63f0040                CALL (R2)                               
  map_faststr.go:442    0x24afc                 f9401fe2                MOVD 56(RSP), R2                        
  map_faststr.go:442    0x24b00                 8a020003                AND R2, R0, R3                          
  map_faststr.go:442    0x24b04                 eb1f007f                CMP ZR, R3                              
  map_faststr.go:442    0x24b08                 9a9f07e3                CSET NE, R3                             
  map_faststr.go:461    0x24b0c                 f9405fe0                MOVD 184(RSP), R0                       
  map_faststr.go:451    0x24b10                 f94063e1                MOVD 192(RSP), R1                       
  map_faststr.go:482    0x24b14                 f94067e2                MOVD 200(RSP), R2                       
  map_faststr.go:442    0x24b18                 f9401fe4                MOVD 56(RSP), R4                        
  map_faststr.go:447    0x24b1c                 f94033e5                MOVD 96(RSP), R5                        
  map_faststr.go:459    0x24b20                 f94027e6                MOVD 72(RSP), R6                        
  map_faststr.go:461    0x24b24                 f9402be7                MOVD 80(RSP), R7                        
  map_faststr.go:456    0x24b28                 3940bfe8                MOVBU 47(RSP), R8                       
  map_faststr.go:447    0x24b2c                 aa0303e9                MOVD R3, R9                             
  map_faststr.go:447    0x24b30                 f9401be3                MOVD 48(RSP), R3                        
  map_faststr.go:447    0x24b34                 9100092a                ADD $2, R9, R10                         
  map_faststr.go:447    0x24b38                 382368aa                MOVB R10, (R5)(R3)                      
  map_faststr.go:448    0x24b3c                 d3401d2a                UBFX $0, R9, $8, R10                    
  map_faststr.go:448    0x24b40                 f100095f                CMP $2, R10                             
  map_faststr.go:448    0x24b44                 54000a02                BCS 80(PC)                              
  map_faststr.go:448    0x24b48                 d37b1d29                UBFIZ $5, R9, $8, R9                    
  map_faststr.go:448    0x24b4c                 9101a3ea                ADD $104, RSP, R10                      
  map_faststr.go:448    0x24b50                 8b09014b                ADD R9, R10, R11                        
  map_faststr.go:448    0x24b54                 f9002feb                MOVD R11, 88(RSP)                       
  map_faststr.go:450    0x24b58                 f940056c                MOVD 8(R11), R12                        
  map_faststr.go:450    0x24b5c                 f100219f                CMP $8, R12                             
  map_faststr.go:450    0x24b60                 54000381                BNE 28(PC)                              
  map_faststr.go:448    0x24b64                 f90023e9                MOVD R9, 64(RSP)                        
  map_faststr.go:451    0x24b68                 f8696942                MOVD (R10)(R9), R2                      
  map_faststr.go:402    0x24b6c                 aa0003e3                MOVD R0, R3                             
  map_faststr.go:451    0x24b70                 aa0103e0                MOVD R1, R0                             
  map_faststr.go:451    0x24b74                 aa0303e1                MOVD R3, R1                             
  map_faststr.go:451    0x24b78                 97ffe8c6                CALL runtime.(*hmap).newoverflow(SB)    
  map_faststr.go:451    0x24b7c                 f94023e3                MOVD 64(RSP), R3                        
  map_faststr.go:451    0x24b80                 9101a3e4                ADD $104, RSP, R4                       
  map_faststr.go:451    0x24b84                 f8236880                MOVD R0, (R4)(R3)                       
  map_faststr.go:452    0x24b88                 f9402fe5                MOVD 88(RSP), R5                        
  map_faststr.go:452    0x24b8c                 f90004bf                MOVD ZR, 8(R5)                          
  stubs.go:18           0x24b90                 91002006                ADD $8, R0, R6                          
  map_faststr.go:453    0x24b94                 f90008a6                MOVD R6, 16(R5)                         
  stubs.go:18           0x24b98                 91022006                ADD $136, R0, R6                        
  map_faststr.go:454    0x24b9c                 f9000ca6                MOVD R6, 24(R5)                         
  map_faststr.go:461    0x24ba0                 f9405fe0                MOVD 184(RSP), R0                       
  map.go:1111           0x24ba4                 f94063e1                MOVD 192(RSP), R1                       
  map_faststr.go:482    0x24ba8                 f94067e2                MOVD 200(RSP), R2                       
  map_faststr.go:442    0x24bac                 f9401fe4                MOVD 56(RSP), R4                        
  map_faststr.go:429    0x24bb0                 f94033e5                MOVD 96(RSP), R5                        
  map_faststr.go:459    0x24bb4                 f94027e6                MOVD 72(RSP), R6                        
  map_faststr.go:461    0x24bb8                 f9402be7                MOVD 80(RSP), R7                        
  map_faststr.go:456    0x24bbc                 3940bfe8                MOVBU 47(RSP), R8                       
  map_faststr.go:456    0x24bc0                 aa0303e9                MOVD R3, R9                             
  map_faststr.go:448    0x24bc4                 9101a3ea                ADD $104, RSP, R10                      
  map_faststr.go:456    0x24bc8                 f9402feb                MOVD 88(RSP), R11                       
  map_faststr.go:428    0x24bcc                 f9401be3                MOVD 48(RSP), R3                        
  map_faststr.go:456    0x24bd0                 f8696949                MOVD (R10)(R9), R9                      
  map_faststr.go:456    0x24bd4                 3980013b                MOVB (R9), R27                          
  map_faststr.go:456    0x24bd8                 f940056a                MOVD 8(R11), R10                        
  map_faststr.go:456    0x24bdc                 9240094a                AND $7, R10, R10                        
  map_faststr.go:456    0x24be0                 382a6928                MOVB R8, (R9)(R10)                      
  map_faststr.go:459    0x24be4                 f9400968                MOVD 16(R11), R8                        
  map_faststr.go:459    0x24be8                 f94000c9                MOVD (R6), R9                           
  map_faststr.go:459    0x24bec                 f94004ca                MOVD 8(R6), R10                         
  map_faststr.go:459    0x24bf0                 f900050a                MOVD R10, 8(R8)                         
  map_faststr.go:459    0x24bf4                 b0016e1b                ADRP 47976448(PC), R27                  
  map_faststr.go:459    0x24bf8                 9124837b                ADD $2336, R27, R27                     
  map_faststr.go:459    0x24bfc                 b940036a                MOVWU (R27), R10                        
  map_faststr.go:459    0x24c00                 3500006a                CBNZW R10, 3(PC)                        
  map_faststr.go:459    0x24c04                 f9000109                MOVD R9, (R8)                           
  map_faststr.go:459    0x24c08                 14000004                JMP 4(PC)                               
  map_faststr.go:459    0x24c0c                 aa0803e2                MOVD R8, R2                             
  map_faststr.go:459    0x24c10                 aa0903e3                MOVD R9, R3                             
  map_faststr.go:459    0x24c14                 9401696b                CALL runtime.gcWriteBarrier(SB)         
  map_faststr.go:461    0x24c18                 f9401c03                MOVD 56(R0), R3                         
  map_faststr.go:461    0x24c1c                 f9400d61                MOVD 24(R11), R1                        
  map_faststr.go:461    0x24c20                 aa0703e2                MOVD R7, R2                             
  map_faststr.go:461    0x24c24                 aa0303e0                MOVD R3, R0                             
  map_faststr.go:461    0x24c28                 9400002a                CALL runtime.typedmemmove(SB)           
  map_faststr.go:462    0x24c2c                 f9402fe3                MOVD 88(RSP), R3                        
  map_faststr.go:462    0x24c30                 f9400464                MOVD 8(R3), R4                          
  map_faststr.go:462    0x24c34                 91000484                ADD $1, R4, R4                          
  map_faststr.go:462    0x24c38                 f9000464                MOVD R4, 8(R3)                          
  map_faststr.go:467    0x24c3c                 f9400864                MOVD 16(R3), R4                         
  stubs.go:18           0x24c40                 91004084                ADD $16, R4, R4                         
  map_faststr.go:467    0x24c44                 f9000864                MOVD R4, 16(R3)                         
  map_faststr.go:468    0x24c48                 f9400c64                MOVD 24(R3), R4                         
  map_faststr.go:468    0x24c4c                 f9405fe1                MOVD 184(RSP), R1                       
  map_faststr.go:468    0x24c50                 39414425                MOVBU 81(R1), R5                        
  stubs.go:18           0x24c54                 8b050084                ADD R5, R4, R4                          
  map_faststr.go:468    0x24c58                 f9000c64                MOVD R4, 24(R3)                         
  map_faststr.go:428    0x24c5c                 aa0103e0                MOVD R1, R0                             
  map.go:1111           0x24c60                 f94063e1                MOVD 192(RSP), R1                       
  map_faststr.go:482    0x24c64                 f94067e2                MOVD 200(RSP), R2                       
  map_faststr.go:428    0x24c68                 f9401be3                MOVD 48(RSP), R3                        
  map_faststr.go:442    0x24c6c                 f9401fe4                MOVD 56(RSP), R4                        
  map_faststr.go:429    0x24c70                 f94033e5                MOVD 96(RSP), R5                        
  stubs.go:18           0x24c74                 f94027e6                MOVD 72(RSP), R6                        
  stubs.go:18           0x24c78                 f9402be7                MOVD 80(RSP), R7                        
  stubs.go:18           0x24c7c                 b27e03e8                ORR $4, ZR, R8                          
  map_faststr.go:468    0x24c80                 17ffff82                JMP -126(PC)                            
  map_faststr.go:448    0x24c84                 aa0a03e0                MOVD R10, R0                            
  map_faststr.go:448    0x24c88                 b27f03e1                ORR $2, ZR, R1                          
  map_faststr.go:448    0x24c8c                 94016a11                CALL runtime.panicIndex(SB)             
  map_faststr.go:435    0x24c90                 9000cec0                ADRP 27099136(PC), R0                   
  map_faststr.go:435    0x24c94                 91030000                ADD $192, R0, R0                        
  map_faststr.go:435    0x24c98                 d28001a1                MOVD $13, R1                            
  map_faststr.go:435    0x24c9c                 940097dd                CALL runtime.throw(SB)                  
  map_faststr.go:435    0x24ca0                 d503201f                NOOP                                    
  map_faststr.go:402    0x24ca4                 f90007e0                MOVD R0, 8(RSP)                         
  map_faststr.go:402    0x24ca8                 f9000be1                MOVD R1, 16(RSP)                        
  map_faststr.go:402    0x24cac                 f9000fe2                MOVD R2, 24(RSP)                        
  map_faststr.go:402    0x24cb0                 aa1e03e3                MOVD R30, R3                            
  map_faststr.go:402    0x24cb4                 94016043                CALL runtime.morestack_noctxt.abi0(SB)  
  map_faststr.go:402    0x24cb8                 f94007e0                MOVD 8(RSP), R0                         
  map_faststr.go:402    0x24cbc                 f9400be1                MOVD 16(RSP), R1                        
  map_faststr.go:402    0x24cc0                 f9400fe2                MOVD 24(RSP), R2                        
  map_faststr.go:402    0x24cc4                 17ffff0f                JMP runtime.evacuate_faststr(SB)        
  map_faststr.go:402    0x24cc8                 00000000                ?                                       
  map_faststr.go:402    0x24ccc                 00000000                ?                                       

lizthegrey avatar Aug 19 '22 16:08 lizthegrey

Thanks! Could you also see the disassembly for 0x7d000 and 0x355e44 in traceback 4?

prattmic avatar Aug 19 '22 16:08 prattmic

TEXT runtime.morestack_noctxt.abi0(SB) /usr/local/go/src/runtime/asm_arm64.s
  asm_arm64.s:323       0x7d000                 2a1f03fa                MOVW ZR, R26                    
  asm_arm64.s:324       0x7d004                 17ffffdf                JMP runtime.morestack.abi0(SB)  
  asm_arm64.s:324       0x7d008                 00000000                ?                               
  asm_arm64.s:324       0x7d00c                 00000000                ?                               
TEXT github.com/klauspost/compress/zstd.(*frameDec).checkCRC(SB) /mnt/ramdisk/go/pkg/mod/github.com/klauspost/[email protected]/zstd/framedec.go
  framedec.go:297       0x355d40                f9400b90                MOVD 16(R28), R16                                                               
  framedec.go:297       0x355d44                eb3063ff                CMP R16, RSP                                                                    
  framedec.go:297       0x355d48                54000789                BLS 60(PC)                                                                      
  framedec.go:297       0x355d4c                f81c0ffe                MOVD.W R30, -64(RSP)                                                            
  framedec.go:297       0x355d50                f81f83fd                MOVD R29, -8(RSP)                                                               
  framedec.go:297       0x355d54                d10023fd                SUB $8, RSP, R29                                                                
  framedec.go:298       0x355d58                39482002                MOVBU 520(R0), R2                                                               
  framedec.go:298       0x355d5c                36000642                TBZ $0, R2, 50(PC)                                                              
  framedec.go:297       0x355d60                f90027e0                MOVD R0, 72(RSP)                                                                
  framedec.go:303       0x355d64                f940e802                MOVD 464(R0), R2                                                                
  framedec.go:303       0x355d68                f940ec03                MOVD 472(R0), R3                                                                
  framedec.go:303       0x355d6c                f9401442                MOVD 40(R2), R2                                                                 
  framedec.go:303       0x355d70                b27e03e1                ORR $4, ZR, R1                                                                  
  framedec.go:303       0x355d74                aa0303e0                MOVD R3, R0                                                                     
  framedec.go:303       0x355d78                d63f0040                CALL (R2)                                                                       
  framedec.go:304       0x355d7c                b50004a3                CBNZ R3, 37(PC)                                                                 
  framedec.go:309       0x355d80                f94027e2                MOVD 72(RSP), R2                                                                
  framedec.go:309       0x355d84                3940e043                MOVBU 56(R2), R3                                                                
  framedec.go:309       0x355d88                370003a3                TBNZ $0, R3, 29(PC)                                                             
  bytes.go:20           0x355d8c                f90017e1                MOVD R1, 40(RSP)                                                                
  bytes.go:20           0x355d90                f9001be0                MOVD R0, 48(RSP)                                                                
  framedec.go:313       0x355d94                b90027ff                MOVW ZR, 36(RSP)                                                                
  framedec.go:314       0x355d98                f9402040                MOVD 64(R2), R0                                                                 
  framedec.go:314       0x355d9c                97ff8c75                CALL github.com/klauspost/compress/zstd/internal/xxhash.(*Digest).Sum64(SB)     
  framedec.go:319       0x355da0                b90027e0                MOVW R0, 36(RSP)                                                                
  bytes.go:20           0x355da4                f94017e1                MOVD 40(RSP), R1                                                                
  bytes.go:20           0x355da8                f100103f                CMP $4, R1                                                                      
  bytes.go:20           0x355dac                540000c1                BNE 6(PC)                                                                       
  bytes.go:20           0x355db0                910093e0                ADD $36, RSP, R0                                                                
  bytes.go:20           0x355db4                f9401be1                MOVD 48(RSP), R1                                                                
  bytes.go:20           0x355db8                b27e03e2                ORR $4, ZR, R2                                                                  
  bytes.go:20           0x355dbc                97f2f341                CALL runtime.memequal(SB)                                                       
  framedec.go:321       0x355dc0                37000140                TBNZ $0, R0, 10(PC)                                                             
  framedec.go:325       0x355dc4                f0013ffb                ADRP 41938944(PC), R27                                                          
  framedec.go:325       0x355dc8                911fc37b                ADD $2032, R27, R27                                                             
  framedec.go:325       0x355dcc                f9400360                MOVD (R27), R0                                                                  
  framedec.go:325       0x355dd0                f0013ffb                ADRP 41938944(PC), R27                                                          
  framedec.go:325       0x355dd4                911fe37b                ADD $2040, R27, R27                                                             
  framedec.go:325       0x355dd8                f9400361                MOVD (R27), R1                                                                  
  framedec.go:325       0x355ddc                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  framedec.go:325       0x355de0                910103ff                ADD $64, RSP, RSP                                                               
  framedec.go:325       0x355de4                d65f03c0                RET                                                                             
  framedec.go:330       0x355de8                d2800000                MOVD $0, R0                                                                     
  framedec.go:330       0x355dec                d2800001                MOVD $0, R1                                                                     
  framedec.go:330       0x355df0                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  framedec.go:330       0x355df4                910103ff                ADD $64, RSP, RSP                                                               
  framedec.go:330       0x355df8                d65f03c0                RET                                                                             
  framedec.go:310       0x355dfc                d2800000                MOVD $0, R0                                                                     
  framedec.go:310       0x355e00                d2800001                MOVD $0, R1                                                                     
  framedec.go:310       0x355e04                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  framedec.go:310       0x355e08                910103ff                ADD $64, RSP, RSP                                                               
  framedec.go:310       0x355e0c                d65f03c0                RET                                                                             
  framedec.go:306       0x355e10                aa0303e0                MOVD R3, R0                                                                     
  framedec.go:306       0x355e14                aa0403e1                MOVD R4, R1                                                                     
  framedec.go:306       0x355e18                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  framedec.go:306       0x355e1c                910103ff                ADD $64, RSP, RSP                                                               
  framedec.go:306       0x355e20                d65f03c0                RET                                                                             
  framedec.go:299       0x355e24                d2800000                MOVD $0, R0                                                                     
  framedec.go:299       0x355e28                d2800001                MOVD $0, R1                                                                     
  framedec.go:299       0x355e2c                a97ffbfd                LDP -8(RSP), (R29, R30)                                                         
  framedec.go:299       0x355e30                910103ff                ADD $64, RSP, RSP                                                               
  framedec.go:299       0x355e34                d65f03c0                RET                                                                             
  framedec.go:297       0x355e38                f90007e0                MOVD R0, 8(RSP)                                                                 
  framedec.go:297       0x355e3c                aa1e03e3                MOVD R30, R3                                                                    
  framedec.go:297       0x355e40                97f49c70                CALL runtime.morestack_noctxt.abi0(SB)                                          
  framedec.go:297       0x355e44                f94007e0                MOVD 8(RSP), R0                                                                 
  framedec.go:297       0x355e48                17ffffbe                JMP github.com/klauspost/compress/zstd.(*frameDec).checkCRC(SB)                 
  framedec.go:297       0x355e4c                00000000                ?                                                                               

lizthegrey avatar Aug 19 '22 16:08 lizthegrey

btw as is implied, that's github.com/klauspost/compress v1.15.9 for versioning/repro purposes

lizthegrey avatar Aug 19 '22 16:08 lizthegrey

diff --git a/src/runtime/asm_arm64.s b/src/runtime/asm_arm64.s
index 7836ba1d96..ec29cd2038 100644
--- a/src/runtime/asm_arm64.s
+++ b/src/runtime/asm_arm64.s
@@ -321,6 +321,7 @@ TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0
 
 TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0
        MOVW    $0, R26
+       MOVD    RSP, RSP // force SPWITE
        B runtime·morestack(SB)
 
 // spillArgs stores return values from registers to a *internal/abi.RegArgs in R20.

@lizthegrey Could you try if this patch makes any difference? Thanks!

cherrymui avatar Aug 19 '22 16:08 cherrymui

I think I made the hourly cutoff for our build train, will report if I see any crashes this afternoon or over the weekend; if no crashes by Monday, then we can assume that was the fix.

lizthegrey avatar Aug 19 '22 18:08 lizthegrey

Unfortunately I do see a crash in the shepherd binary (same as from tracebacks 1-3), same PC -- 0x7cdc0. didn't manage to get to the pod before Kubernetes GCed it though, and for some reason log shipping is broken in that environment.

It'll likely happen again over weekend though, if it happened within 3 hours of deploy of the putative fix

lizthegrey avatar Aug 19 '22 23:08 lizthegrey

@lizthegrey have you seen more failures and do you have any logs to share? Could you doublecheck that the binary includes the patch? Thanks!

cherrymui avatar Aug 22 '22 20:08 cherrymui

@lizthegrey have you seen more failures and do you have any logs to share? Could you doublecheck that the binary includes the patch? Thanks!

No failures over weekend. Yes, 100% sure the crash from Fri includes the patch, and I can manually scrape the logs from another system, it'll just take some time and thus hoping it happens somewhere else where I can grab the logs automatically instead.

lizthegrey avatar Aug 22 '22 20:08 lizthegrey

https://share.getcloudapp.com/p9uQpJww -- no crashes today either. the crash on Friday must have been a fluke, but at least we have the telemetry to detect it if it does happen over the next few weeks. I'll still work on reconstructing the logs from the Friday crash immediately post-deploy though.

lizthegrey avatar Aug 23 '22 04:08 lizthegrey

share.getcloudapp.com/p9uQpJww -- no crashes today either. the crash on Friday must have been a fluke, but at least we have the telemetry to detect it if it does happen over the next few weeks. I'll still work on reconstructing the logs from the Friday crash immediately post-deploy though.

Found the problem -- the pod was a lingering holdover of the previous build, despite running chronologically long after the new release should have gone. It did not have the patch. You are good to propose the MOVD RSP, RSP patch for 1.20 release and for 1.19 and for 1.18 backport trains.

thank you thank you thank you @cherrymui and @prattmic for your patience.

lizthegrey avatar Aug 23 '22 17:08 lizthegrey

Thanks @lizthegrey ! This is great. I'll send a CL soon. And thank you very much for providing the debug logs and helping us find the issue.

cherrymui avatar Aug 23 '22 17:08 cherrymui

Change https://go.dev/cl/425396 mentions this issue: runtime: mark morestack_noctxt SPWRITE on LR architectures

gopherbot avatar Aug 24 '22 22:08 gopherbot

@gopherbot please backport this to previous releases. This may cause a runtime crash. Thanks.

cherrymui avatar Aug 25 '22 15:08 cherrymui

Backport issue(s) opened: #54674 (for 1.18), #54675 (for 1.19).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

gopherbot avatar Aug 25 '22 15:08 gopherbot

Change https://go.dev/cl/425615 mentions this issue: [release-branch.go1.19] runtime: mark morestack_noctxt SPWRITE on LR architectures

gopherbot avatar Aug 25 '22 20:08 gopherbot

Change https://go.dev/cl/425616 mentions this issue: [release-branch.go1.18] runtime: mark morestack_noctxt SPWRITE on LR architectures

gopherbot avatar Aug 25 '22 20:08 gopherbot