go icon indicating copy to clipboard operation
go copied to clipboard

runtime: fatal error: acquireSudog: found s.elem != nil in cache

Open AdamKorcz opened this issue 2 years ago • 4 comments

What version of Go are you using (go version)?

$ go version go1.17 linux/amd64

Does this issue reproduce with the latest release?

Reproducer is not yet created.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3562792503=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I encountered this bug after running a go-fuzz fuzzer for about 1 week. The fuzzer is currently not publicly available.

What did you expect to see?

I expected not to see this crash.

What did you see instead?

fatal error: acquireSudog: found s.elem != nil in cache

goroutine 1355056023 [running]:
runtime.throw(0x2319a0b, 0x2a)
	runtime/panic.go:1117 +0x74 fp=0x10c0009b4468 sp=0x10c0009b4438 pc=0x58e7b4
runtime.acquireSudog(0x10c000a91bc0)
	runtime/proc.go:383 +0x376 fp=0x10c0009b44d8 sp=0x10c0009b4468 pc=0x591af6
runtime.selectgo(0x10c0009b4768, 0x10c0009b46c0, 0x0, 0x0, 0x4, 0x10c0009b4701, 0xf06665, 0x10c0001581c0)
	runtime/select.go:298 +0xbf7 fp=0x10c0009b4610 sp=0x10c0009b44d8 pc=0x5a2957
google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0x10c0006b9380)
	google.golang.org/[email protected]/internal/transport/http2_server.go:993 +0x2c5 fp=0x10c0009b47d8 sp=0x10c0009b4610 pc=0xe46d45
runtime.goexit()
	runtime/asm_amd64.s:1371 +0x1 fp=0x10c0009b47e0 sp=0x10c0009b47d8 pc=0x5ca1a1
created by google.golang.org/grpc/internal/transport.newHTTP2Server
	google.golang.org/[email protected]/internal/transport/http2_server.go:300 +0x1545

goroutine 17 [select, locked to thread]:
google.golang.org/grpc/internal/transport.(*Stream).waitOnHeader(0x10c000778120)
	google.golang.org/[email protected]/internal/transport/transport.go:322 +0xbf
google.golang.org/grpc/internal/transport.(*Stream).RecvCompress(...)
	google.golang.org/[email protected]/internal/transport/transport.go:337
google.golang.org/grpc.(*csAttempt).recvMsg(0x10c0007b9c00, 0x2a66de0, 0x10c0006f6c80, 0x0, 0x0, 0x0)
	google.golang.org/[email protected]/stream.go:943 +0x911
google.golang.org/grpc.(*clientStream).RecvMsg.func1(0x10c0007b9c00, 0x4124330, 0x0)
	google.golang.org/[email protected]/stream.go:808 +0x57
google.golang.org/grpc.(*clientStream).withRetry(0x10c000133e60, 0x10c000759830, 0x10c000759800, 0x10c000985970, 0xf0a750)
	google.golang.org/[email protected]/stream.go:666 +0x130
google.golang.org/grpc.(*clientStream).RecvMsg(0x10c000133e60, 0x2a66de0, 0x10c0006f6c80, 0x0, 0x0)
	google.golang.org/[email protected]/stream.go:807 +0x145
google.golang.org/grpc.invoke(0x2b99920, 0x10c00032de60, 0x2319327, 0x2a, 0x2a66ca0, 0x10c0006f6c00, 0x2a66de0, 0x10c0006f6c80, 0x10c0004b0700, 0x10c00089f540, ...)
	google.golang.org/[email protected]/call.go:73 +0x17f
google.golang.org/grpc.(*ClientConn).Invoke(0x10c0004b0700, 0x2b99920, 0x10c00032de60, 0x2319327, 0x2a, 0x2a66ca0, 0x10c0006f6c00, 0x2a66de0, 0x10c0006f6c80, 0x0, ...)
	google.golang.org/[email protected]/call.go:37 +0x22a
github.com/containerd/containerd/api/services/images/v1.(*imagesClient).List(0x10c0001b65d0, 0x2b99920, 0x10c00032de60, 0x10c0006f6c00, 0x0, 0x0, 0x0, 0x10c0001b65d0, 0x10c0008b2878, 0x2b36230)
	github.com/containerd/containerd/[email protected]/services/images/v1/images.pb.go:573 +0xe5
github.com/containerd/containerd.(*remoteImages).List(0x10c00009eba0, 0x2b99920, 0x10c00032de60, 0x0, 0x0, 0x0, 0x30, 0x10c00032de60, 0x10c000064000, 0x10c000064000, ...)
	github.com/containerd/containerd/image_store.go:53 +0xb7
github.com/containerd/containerd.(*Client).ListImages(0x10c0008b27e0, 0x2b99920, 0x10c00032de60, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	github.com/containerd/containerd/client.go:490 +0xa9
github.com/containerd/containerd/contrib/fuzz.FuzzContainerdImport(0x6020000080f0, 0x1, 0x1, 0x0)
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:188 +0x553
main.LLVMFuzzerTestOneInput(...)
	command-line-arguments/main.038934106.go:21

goroutine 20 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x40f3b20)
	k8s.io/klog/[email protected]/klog.go:1169 +0xab
created by k8s.io/klog/v2.init.0
	k8s.io/klog/[email protected]/klog.go:420 +0xff

goroutine 62 [chan receive, 13388 minutes]:
github.com/containerd/containerd/cmd/containerd/command.App.func1(0x10c0004c8000, 0x0, 0x0)
	github.com/containerd/containerd/cmd/containerd/command/main.go:253 +0xeec
github.com/urfave/cli.HandleAction(0x287ff80, 0x2b318c8, 0x10c0004c8000, 0x0, 0x0)
	github.com/urfave/[email protected]/app.go:523 +0x1ab
github.com/urfave/cli.(*App).Run(0x10c000158000, 0x10c000051a40, 0x2, 0x2, 0x0, 0x0)
	github.com/urfave/[email protected]/app.go:285 +0x7a6
github.com/containerd/containerd/cmd/containerd/command.StartDaemonForFuzzing(0x10c000051a40, 0x2, 0x2)
	github.com/containerd/containerd/cmd/containerd/command/main.go:375 +0x56
github.com/containerd/containerd/contrib/fuzz.init.0.func1(0x10c000051a40, 0x2, 0x2)
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:48 +0x50
created by github.com/containerd/containerd/contrib/fuzz.init.0
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:43 +0x9e

goroutine 7 [select]:
github.com/containerd/containerd/cmd/containerd/command.handleSignals.func1(0x10c0005a4420, 0x10c0005a43c0, 0x2b99920, 0x10c0004642d0, 0x10c0000b6000)
	github.com/containerd/containerd/cmd/containerd/command/main_unix.go:43 +0xee
created by github.com/containerd/containerd/cmd/containerd/command.handleSignals
	github.com/containerd/containerd/cmd/containerd/command/main_unix.go:40 +0x9a

goroutine 41 [syscall]:
os/signal.signal_recv(0x2b761e8)
	runtime/sigqueue.go:168 +0xa5
os/signal.loop()
	os/signal/signal_unix.go:23 +0x45
created by os/signal.Notify.func1.1
	os/signal/signal.go:151 +0x78

goroutine 203 [sleep]:
time.Sleep(0x2540be400)
	runtime/time.go:193 +0xd8
github.com/containerd/containerd/runtime/restart/monitor.(*monitor).run(0x10c0005881d0, 0x2540be400)
	github.com/containerd/containerd/runtime/restart/monitor/monitor.go:166 +0x85
created by github.com/containerd/containerd/runtime/restart/monitor.init.0.func1
	github.com/containerd/containerd/runtime/restart/monitor/monitor.go:87 +0x167

goroutine 201 [syscall, 13388 minutes]:
syscall.Syscall6(0xe8, 0x7, 0x10c0006419a8, 0x80, 0xffffffffffffffff, 0x0, 0x0, 0x6c80d8, 0x10c00025e247, 0x1)
	syscall/asm_linux_amd64.s:43 +0x5
golang.org/x/sys/unix.EpollWait(0x7, 0x10c0006419a8, 0x80, 0x80, 0xffffffffffffffff, 0x10c000044420, 0x2, 0x0)
	golang.org/x/[email protected]/unix/zsyscall_linux_amd64.go:77 +0xaf
github.com/containerd/containerd/metrics/cgroups/v1.(*oomCollector).start(0x10c0000feb80)
	github.com/containerd/containerd/metrics/cgroups/v1/oom.go:114 +0x9b
created by github.com/containerd/containerd/metrics/cgroups/v1.newOOMCollector
	github.com/containerd/containerd/metrics/cgroups/v1/oom.go:50 +0x11e

goroutine 225 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0da0, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240998, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240980, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240980, 0x2, 0x10c000116af0, 0x5d46cc)
	net/fd_unix.go:172 +0x53
net.(*UnixListener).accept(0x10c0001e23c0, 0x10c000116b08, 0x10c000116b10, 0x28)
	net/unixsock_posix.go:162 +0x45
net.(*UnixListener).Accept(0x10c0001e23c0, 0x2b339c0, 0x10c0001581c0, 0x2b92370, 0x10c0001e23c0)
	net/unixsock.go:260 +0xa5
google.golang.org/grpc.(*Server).Serve(0x10c0001581c0, 0x2b92370, 0x10c0001e23c0, 0x0, 0x0)
	google.golang.org/[email protected]/server.go:786 +0x2bd
github.com/containerd/containerd/services/server.(*Server).ServeGRPC(0x10c0003d2a00, 0x2b92370, 0x10c0001e23c0, 0x18, 0x22dfa16)
	github.com/containerd/containerd/services/server/server.go:272 +0x94
github.com/containerd/containerd/cmd/containerd/command.serve.func1(0x2b92370, 0x10c0001e23c0, 0x10c0005d93e0, 0x2b99920, 0x10c0001e2450, 0x10c0005dc4c0, 0x1f)
	github.com/containerd/containerd/cmd/containerd/command/main.go:267 +0x94
created by github.com/containerd/containerd/cmd/containerd/command.serve
	github.com/containerd/containerd/cmd/containerd/command/main.go:265 +0x32e

goroutine 200 [select]:
github.com/containerd/containerd/gc/scheduler.(*gcScheduler).run(0x10c00008acc0, 0x2b99920, 0x10c0004642d0)
	github.com/containerd/containerd/gc/scheduler/scheduler.go:268 +0x1d1
created by github.com/containerd/containerd/gc/scheduler.init.0.func1
	github.com/containerd/containerd/gc/scheduler/scheduler.go:132 +0x485

goroutine 222 [syscall, 13388 minutes]:
syscall.Syscall6(0xe8, 0xa, 0x10c00072bbe4, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
	syscall/asm_linux_amd64.s:43 +0x5
golang.org/x/sys/unix.EpollWait(0xa, 0x10c00072bbe4, 0x7, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0)
	golang.org/x/[email protected]/unix/zsyscall_linux_amd64.go:77 +0xaf
github.com/fsnotify/fsnotify.(*fdPoller).wait(0x10c0005dc3e0, 0x0, 0x0, 0x0)
	github.com/fsnotify/[email protected]/inotify_poller.go:86 +0xbc
github.com/fsnotify/fsnotify.(*Watcher).readEvents(0x10c0001c8b40)
	github.com/fsnotify/[email protected]/inotify.go:192 +0x229
created by github.com/fsnotify/fsnotify.NewWatcher
	github.com/fsnotify/[email protected]/inotify.go:59 +0x20d

goroutine 224 [IO wait, 13388 minutes]:
internal/poll.runtime_pollWait(0x7f4fc2ff0e88, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240818, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240800, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240800, 0x10c000118b00, 0x10d7be5, 0x10c0005ca160)
	net/fd_unix.go:172 +0x53
net.(*UnixListener).accept(0x10c0001e2300, 0x34320000, 0x2b92370, 0x10c0001e2300)
	net/unixsock_posix.go:162 +0x45
net.(*UnixListener).Accept(0x10c0001e2300, 0x2b92370, 0x10c0001e2300, 0xa, 0x10c000118be8)
	net/unixsock.go:260 +0xa5
github.com/containerd/ttrpc.(*Server).Serve(0x10c0005ca150, 0x2b998b0, 0x10c0001b0000, 0x2b92370, 0x10c0001e2300, 0x0, 0x0)
	github.com/containerd/[email protected]/server.go:87 +0x13b
github.com/containerd/containerd/services/server.(*Server).ServeTTRPC(0x10c0003d2a00, 0x2b92370, 0x10c0001e2300, 0x19, 0x22cf35c)
	github.com/containerd/containerd/services/server/server.go:277 +0x70
github.com/containerd/containerd/cmd/containerd/command.serve.func1(0x2b92370, 0x10c0001e2300, 0x10c0005d9350, 0x2b99920, 0x10c0001e2390, 0x10c0005d0720, 0x25)
	github.com/containerd/containerd/cmd/containerd/command/main.go:267 +0x94
created by github.com/containerd/containerd/cmd/containerd/command.serve
	github.com/containerd/containerd/cmd/containerd/command/main.go:265 +0x32e

goroutine 259 [chan receive]:
github.com/containerd/containerd/pkg/cri/server.(*snapshotsSyncer).start.func1(0x10c0001c94f0, 0x10c0004d1b20)
	github.com/containerd/containerd/pkg/cri/server/snapshots.go:65 +0x76
created by github.com/containerd/containerd/pkg/cri/server.(*snapshotsSyncer).start
	github.com/containerd/containerd/pkg/cri/server/snapshots.go:57 +0x71

goroutine 261 [IO wait, 13388 minutes]:
internal/poll.runtime_pollWait(0x7f4fc2ff0cb8, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240d18, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240d00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240d00, 0x10c0001c9810, 0x7f4fc1c8d100, 0x30)
	net/fd_unix.go:172 +0x53
net.(*TCPListener).accept(0x10c00027e8b8, 0x10c00011dd70, 0x56447a, 0x30)
	net/tcpsock_posix.go:139 +0x45
net.(*TCPListener).Accept(0x10c00027e8b8, 0x2a16ba0, 0x10c000287d70, 0x28c9280, 0x36f59b0)
	net/tcpsock.go:261 +0xa5
net/http.(*Server).Serve(0x10c0005ae380, 0x2b92340, 0x10c00027e8b8, 0x0, 0x0)
	net/http/server.go:2981 +0x31e
github.com/containerd/containerd/pkg/cri/streaming.(*server).Start(0x10c000310240, 0x1, 0x0, 0x0)
	github.com/containerd/containerd/pkg/cri/streaming/server.go:265 +0x1a5
github.com/containerd/containerd/pkg/cri/server.(*criService).Run.func2(0x10c0005cf440, 0x10c00032e000)
	github.com/containerd/containerd/pkg/cri/server/service.go:216 +0x90
created by github.com/containerd/containerd/pkg/cri/server.(*criService).Run
	github.com/containerd/containerd/pkg/cri/server/service.go:214 +0x538

goroutine 226 [select]:
github.com/containerd/containerd/pkg/cri/server.(*eventMonitor).start.func1(0x10c0005cf3e0, 0x10c000552b00, 0x10c000279800)
	github.com/containerd/containerd/pkg/cri/server/events.go:252 +0x136
created by github.com/containerd/containerd/pkg/cri/server.(*eventMonitor).start
	github.com/containerd/containerd/pkg/cri/server/events.go:249 +0xd3

goroutine 260 [select, 13388 minutes]:
github.com/containerd/containerd/pkg/cri/server.(*cniNetConfSyncer).syncLoop(0x10c000274e70, 0x0, 0x0)
	github.com/containerd/containerd/pkg/cri/server/cni_conf_syncer.go:75 +0xf5
github.com/containerd/containerd/pkg/cri/server.(*criService).Run.func1(0x10c0002a6300, 0x10c00032e000)
	github.com/containerd/containerd/pkg/cri/server/service.go:208 +0x6f
created by github.com/containerd/containerd/pkg/cri/server.(*criService).Run
	github.com/containerd/containerd/pkg/cri/server/service.go:206 +0x490

goroutine 199 [select, 13388 minutes]:
github.com/docker/go-events.(*Broadcaster).run(0x10c0001468c0)
	github.com/docker/[email protected]/broadcast.go:117 +0x19a
created by github.com/docker/go-events.NewBroadcaster
	github.com/docker/[email protected]/broadcast.go:39 +0x1d5

goroutine 246 [sync.Cond.Wait, 13388 minutes]:
sync.runtime_notifyListWait(0x10c00003d490, 0x10c000000000)
	runtime/sema.go:513 +0xfc
sync.(*Cond).Wait(0x10c00003d480)
	sync/cond.go:56 +0xf2
github.com/docker/go-events.(*Queue).next(0x10c00057f9b0, 0x0, 0x0)
	github.com/docker/[email protected]/queue.go:103 +0xba
github.com/docker/go-events.(*Queue).run(0x10c00057f9b0)
	github.com/docker/[email protected]/queue.go:68 +0x65
created by github.com/docker/go-events.NewQueue
	github.com/docker/[email protected]/queue.go:29 +0x11e

goroutine 223 [select, 13388 minutes]:
github.com/containerd/containerd/pkg/cri/server.(*criService).Run(0x10c00032e000, 0x2b998b0, 0x10c0001b0000)
	github.com/containerd/containerd/pkg/cri/server/service.go:227 +0x616
github.com/containerd/containerd/pkg/cri.initCRIService.func1(0x2bc15a8, 0x10c00032e000, 0x2b99920, 0x10c0004642d0)
	github.com/containerd/containerd/pkg/cri/cri.go:107 +0x62
created by github.com/containerd/containerd/pkg/cri.initCRIService
	github.com/containerd/containerd/pkg/cri/cri.go:106 +0x797

goroutine 247 [select, 13388 minutes]:
github.com/containerd/containerd/events/exchange.(*Exchange).Subscribe.func3(0x10c00057fa10, 0x10c0001d9ce0, 0x10c000047c80, 0x2b99878, 0x10c000552ac0, 0x10c0004ffd40)
	github.com/containerd/containerd/events/exchange/exchange.go:171 +0x130
created by github.com/containerd/containerd/events/exchange.(*Exchange).Subscribe
	github.com/containerd/containerd/events/exchange/exchange.go:165 +0x2cc

goroutine 1355056036 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0748, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000865718, 0x72, 0x8000, 0x8000, 0xffffffffffffffff)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x10c000865700, 0x10c00067c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:166 +0x313
net.(*netFD).Read(0x10c000865700, 0x10c00067c000, 0x8000, 0x8000, 0x9, 0x8, 0x0)
	net/fd_posix.go:55 +0x65
net.(*conn).Read(0x10c000134508, 0x10c00067c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	net/net.go:183 +0xd0
bufio.(*Reader).Read(0x10c000a320c0, 0x10c0008b28f8, 0x9, 0x9, 0x0, 0x0, 0x10c0009c7c68)
	bufio/bufio.go:227 +0x413
io.ReadAtLeast(0x2b5a548, 0x10c000a320c0, 0x10c0008b28f8, 0x9, 0x9, 0x9, 0x100000000000, 0x561e05, 0x46ff60)
	io/io.go:328 +0xc8
io.ReadFull(...)
	io/io.go:347
golang.org/x/net/http2.readFrameHeader(0x10c0008b28f8, 0x9, 0x9, 0x2b5a548, 0x10c000a320c0, 0x0, 0x0, 0x10c0008a88d0, 0x100000000000000)
	golang.org/x/[email protected]/http2/frame.go:237 +0x98
golang.org/x/net/http2.(*Framer).ReadFrame(0x10c0008b28c0, 0x10c0008a88d0, 0x10c0008a8800, 0x0, 0x0)
	golang.org/x/[email protected]/http2/frame.go:492 +0xc5
google.golang.org/grpc/internal/transport.(*http2Client).reader(0x10c00096c1e0)
	google.golang.org/[email protected]/internal/transport/http2_client.go:1347 +0x1f5
created by google.golang.org/grpc/internal/transport.newHTTP2Client
	google.golang.org/[email protected]/internal/transport/http2_client.go:346 +0xfdd

goroutine 1355055969 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0x10c000552940)
	google.golang.org/[email protected]/balancer_conn_wrappers.go:69 +0xe5
created by google.golang.org/grpc.newCCBalancerWrapper
	google.golang.org/[email protected]/balancer_conn_wrappers.go:60 +0x17e

goroutine 1355056024 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0830, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000997c18, 0x72, 0x8000, 0x8000, 0xffffffffffffffff)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x10c000997c00, 0x10c000696000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:166 +0x313
net.(*netFD).Read(0x10c000997c00, 0x10c000696000, 0x8000, 0x8000, 0x9, 0x8, 0x0)
	net/fd_posix.go:55 +0x65
net.(*conn).Read(0x10c000588428, 0x10c000696000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	net/net.go:183 +0xd0
bufio.(*Reader).Read(0x10c0000e4600, 0x10c0004ab378, 0x9, 0x9, 0x0, 0x0, 0x10c000655cc8)
	bufio/bufio.go:227 +0x413
io.ReadAtLeast(0x2b5a548, 0x10c0000e4600, 0x10c0004ab378, 0x9, 0x9, 0x9, 0x100000000000, 0x561f85, 0x46ffe0)
	io/io.go:328 +0xc8
io.ReadFull(...)
	io/io.go:347
golang.org/x/net/http2.readFrameHeader(0x10c0004ab378, 0x9, 0x9, 0x2b5a548, 0x10c0000e4600, 0x0, 0x0, 0x0, 0x0)
	golang.org/x/[email protected]/http2/frame.go:237 +0x98
golang.org/x/net/http2.(*Framer).ReadFrame(0x10c0004ab340, 0x10c0008a88a0, 0x40f3760, 0x0, 0x0)
	golang.org/x/[email protected]/http2/frame.go:492 +0xc5
google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0x10c0006b9380, 0x10c0005906f0, 0x2b33a00)
	google.golang.org/[email protected]/internal/transport/http2_server.go:473 +0xc5
google.golang.org/grpc.(*Server).serveStreams(0x10c0001581c0, 0x2bb0078, 0x10c0006b9380)
	google.golang.org/[email protected]/server.go:918 +0x11c
google.golang.org/grpc.(*Server).handleRawConn.func1(0x10c0001581c0, 0x2bb0078, 0x10c0006b9380, 0x10c0005dc4c0, 0x1f)
	google.golang.org/[email protected]/server.go:868 +0x50
created by google.golang.org/grpc.(*Server).handleRawConn
	google.golang.org/[email protected]/server.go:867 +0x692

goroutine 1355056022 [running]:
	goroutine running on other thread; stack unavailable
created by google.golang.org/grpc/internal/transport.newHTTP2Server
	google.golang.org/[email protected]/internal/transport/http2_server.go:289 +0x151f

goroutine 1355055970 [chan receive]:
google.golang.org/grpc.(*addrConn).resetTransport(0x10c0004038c0)
	google.golang.org/[email protected]/clientconn.go:1214 +0xb71
created by google.golang.org/grpc.(*addrConn).connect
	google.golang.org/[email protected]/clientconn.go:844 +0x2c5

goroutine 1355056037 [runnable]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0x10c00008db80, 0x1, 0x0, 0x0, 0x0, 0x0)
	google.golang.org/[email protected]/internal/transport/controlbuf.go:407 +0x1cc
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0x10c000a32180, 0x0, 0x0)
	google.golang.org/[email protected]/internal/transport/controlbuf.go:527 +0x2fc
google.golang.org/grpc/internal/transport.newHTTP2Client.func3(0x10c00096c1e0)
	google.golang.org/[email protected]/internal/transport/http2_client.go:396 +0x8c
created by google.golang.org/grpc/internal/transport.newHTTP2Client
	google.golang.org/[email protected]/internal/transport/http2_client.go:394 +0x15f5

AdamKorcz avatar Sep 04 '21 11:09 AdamKorcz

This may be tough to solve without a reproduction case.

CC @aclements @mknyszek @prattmic

ianlancetaylor avatar Sep 05 '21 22:09 ianlancetaylor

getting the same issue.. !!!

kuabhish avatar Aug 08 '22 06:08 kuabhish

@ianlancetaylor @AdamKorcz Did you find any solutions to prevent this from happening? I am facing the same issue in production for my chat service which handles alot of requests per second and fetches data from redis sources and does some processing to the list struct which i created on the fly.

vignesh-codes avatar Aug 08 '22 06:08 vignesh-codes

@kuabhish @vignesh-codes Do either of you have a reproducer you can share?

prattmic avatar Aug 08 '22 14:08 prattmic

@prattmic Hey. Apologies for the delay. We actually did some changes to the function which had for loop that spawned go routine at each iterations. Each go routine fetches some data from redis and appends to a global var(not read by others).

We removed and it worked somehow.

I have given a pseudocode of the function here which caused the issue. No more crashes in our service

response := []map[string]interface{}{}
for _, j := range entries {
    wg.Add(1)
    go func() {
        defer wg.Done()
        data_bytes, err := fetchFromRedis(j)
        if err != nil {
            return
        }
        var data_mapped map[string]interface{}
        err = json.Unmarshal(data_bytes, data_mapped)
        if err != nil  {
            return
        }
        response.append(response, data_mapped)
    }
    wg.Wait()
    return response
}

I am not exactly sure why this issue was happening. We made use of redis pipe for optimisation here. Our production service has lots of traffic per minute. I hope it can help you debug the issue.

vignesh-codes avatar Aug 26 '22 19:08 vignesh-codes

Appending to a global variable without synchronization is a data race. Have you tried running with the race detector?

randall77 avatar Aug 26 '22 19:08 randall77

hmmm yeah might be coz of that.. have nt checked with race detector. I did not add mutex there since there were no read happening concurrently and was just appending

vignesh-codes avatar Aug 26 '22 20:08 vignesh-codes

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

gopherbot avatar Sep 08 '22 14:09 gopherbot