
signal SIGSEGV: segmentation violation

Open hellobinge opened this issue 6 years ago • 10 comments

I recently upgraded my Ubuntu kernel, and now I am not able to run flanneld. When I run flanneld, I get signal SIGSEGV: segmentation violation.

I tested v0.11.0, v0.10.0 and v0.9.1.

Both v0.11.0 and v0.10.0 fail with signal SIGSEGV: segmentation violation, but v0.9.1 works well.

I think this issue may be related to #977 https://github.com/coreos/flannel/issues/977

However, #977 is closed and marked as solved. I am not sure whether the root cause is the same.

operating system: Linux exciting65 4.18.0-15-generic #16~18.04.1-Ubuntu SMP Thu Feb 7 14:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
etcd version: {"etcdserver":"3.3.9","etcdcluster":"3.3.0"}

here is the error information:

I0219 20:09:13.975022 4704 main.go:514] Determining IP address of default interface
I0219 20:09:13.975197 4704 main.go:527] Using interface with name enp7s0 and address 10.160.110.23
I0219 20:09:13.975207 4704 main.go:544] Defaulting external address to interface address (10.160.110.23)
I0219 20:09:13.975277 4704 main.go:244] Created subnet manager: Etcd Local Manager with Previous Subnet: 10.67.5.0/24
I0219 20:09:13.975283 4704 main.go:247] Installing signal handlers
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x63 pc=0x7faa4698f448]

runtime stack:
runtime.throw(0x19a3a14, 0x2a)
	/usr/local/go/src/runtime/panic.go:616 +0x81
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:372 +0x28e

goroutine 56 [syscall]:
runtime.cgocall(0x1441ee0, 0xc4200675f8, 0x29)
	/usr/local/go/src/runtime/cgocall.go:128 +0x64 fp=0xc4200675b8 sp=0xc420067580 pc=0x402304
net._C2func_getaddrinfo(0xc4200dfd80, 0x0, 0xc42028ef60, 0xc4201681b0, 0x0, 0x0, 0x0)
	_cgo_gotypes.go:86 +0x55 fp=0xc4200675f8 sp=0xc4200675b8 pc=0x51d535
net.cgoLookupIPCNAME.func1(0xc4200dfd80, 0x0, 0xc42028ef60, 0xc4201681b0, 0x17, 0x17, 0xc4201b5bc0)
	/usr/local/go/src/net/cgo_unix.go:149 +0x13b fp=0xc420067640 sp=0xc4200675f8 pc=0x52429b
net.cgoLookupIPCNAME(0xc4200df9a0, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/cgo_unix.go:149 +0x174 fp=0xc420067738 sp=0xc420067640 pc=0x51eba4
net.cgoIPLookup(0xc4201ab860, 0xc4200df9a0, 0x16)
	/usr/local/go/src/net/cgo_unix.go:201 +0x4d fp=0xc4200677c8 sp=0xc420067738 pc=0x51f26d
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc4200677d0 sp=0xc4200677c8 pc=0x456f71
created by net.cgoLookupIP
	/usr/local/go/src/net/cgo_unix.go:211 +0xaf

goroutine 1 [select]:
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do(0xc4204115e0, 0x7faa44077610, 0xc4202575c0, 0x1acb460, 0xc42028e870, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:531 +0x2e9
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*redirectFollowingHTTPClient).Do(0xc4205366a0, 0x7faa44077610, 0xc4202575c0, 0x1acb460, 0xc42028e870, 0x7ffe1559330d, 0x1b, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:603 +0xad
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*httpClusterClient).Do(0xc4201ab1a0, 0x7faa44077610, 0xc4202575c0, 0x1acb460, 0xc42028e870, 0xc4205ddab0, 0x4115e8, 0x3, 0x17cbba0, 0xc42028e601, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:360 +0x346
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*httpKeysAPI).Get(0xc420536640, 0x7faa44077610, 0xc4202575c0, 0xc4200df920, 0x1a, 0xc4200c4988, 0x44e3a3, 0x17bf4e0, 0xc40000a258)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/keys.go:422 +0xd9
github.com/coreos/flannel/subnet/etcdv2.(*etcdSubnetRegistry).getNetworkConfig(0xc42028e6f0, 0x7faa44077610, 0xc4202575c0, 0x4, 0x40e26d, 0xc42022c000, 0x7faa44077610)
	/go/src/github.com/coreos/flannel/subnet/etcdv2/registry.go:117 +0x125
github.com/coreos/flannel/subnet/etcdv2.(*LocalManager).GetNetworkConfig(0xc420536660, 0x7faa44077610, 0xc4202575c0, 0x17bf518, 0xc4205ddc38, 0x40e114)
	/go/src/github.com/coreos/flannel/subnet/etcdv2/local_manager.go:88 +0x4b
main.getConfig(0x7faa44077610, 0xc4202575c0, 0x1ae4ea0, 0xc420536660, 0xc4201ab200, 0xc4203082b0, 0xc4200c4a00)
	/go/src/github.com/coreos/flannel/main.go:380 +0xa8
main.main()
	/go/src/github.com/coreos/flannel/main.go:271 +0x528

goroutine 20 [syscall]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:139 +0xa6
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 21 [chan receive]:
github.com/coreos/flannel/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x27a1f40)
	/go/src/github.com/coreos/flannel/vendor/github.com/golang/glog/glog.go:879 +0x8b
created by github.com/coreos/flannel/vendor/github.com/golang/glog.init.0
	/go/src/github.com/coreos/flannel/vendor/github.com/golang/glog/glog.go:410 +0x203

goroutine 50 [select, locked to thread]:
runtime.gopark(0x19f75b0, 0x0, 0x197b171, 0x6, 0x18, 0x1)
	/usr/local/go/src/runtime/proc.go:291 +0x11a
runtime.selectgo(0xc420065f50, 0xc4200ae2a0)
	/usr/local/go/src/runtime/select.go:392 +0xe50
runtime.ensureSigM.func1()
	/usr/local/go/src/runtime/signal_unix.go:549 +0x1f4
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2361 +0x1

goroutine 51 [select]:
main.shutdownHandler(0x7faa44077610, 0xc4202575c0, 0xc4201ab200, 0xc4203082b0)
	/go/src/github.com/coreos/flannel/main.go:364 +0xde
main.main.func1(0x7faa44077610, 0xc4202575c0, 0xc4201ab200, 0xc4203082b0, 0xc4200c4a00)
	/go/src/github.com/coreos/flannel/main.go:261 +0x49
created by main.main
	/go/src/github.com/coreos/flannel/main.go:260 +0x4e6

goroutine 52 [select]:
net/http.(*Transport).getConn(0xc42026c000, 0xc42028ea20, 0x0, 0xc4200aa2a0, 0x4, 0xc4200df9a0, 0x1b, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:962 +0x558
net/http.(*Transport).RoundTrip(0xc42026c000, 0xc420340800, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:409 +0x632
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do.func1(0xc4204115e0, 0xc420340800, 0xc4201ab320)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:523 +0x41
created by github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:522 +0x1e3

goroutine 53 [select]:
net.(*Resolver).LookupIPAddr(0x27a0000, 0x1adfb40, 0xc4201ab6e0, 0xc4200df9a0, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/lookup.go:212 +0x50d
net.(*Resolver).internetAddrList(0x27a0000, 0x1adfb40, 0xc4201ab6e0, 0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/net/ipsock.go:293 +0x5c4
net.(*Resolver).resolveAddrList(0x27a0000, 0x1adfb40, 0xc4201ab6e0, 0x1978c7d, 0x4, 0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0x0, ...)
	/usr/local/go/src/net/dial.go:193 +0x50c
net.(*Dialer).DialContext(0xc4201ab0e0, 0x1adfb00, 0xc4200c4010, 0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/net/dial.go:375 +0x22b
net.(*Dialer).Dial(0xc4201ab0e0, 0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0x60, 0x110, 0x110, 0xc4203d86c0)
	/usr/local/go/src/net/dial.go:320 +0x75
net.(*Dialer).Dial-fm(0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0xc4201ab680, 0xc420077ae8, 0x4041c6, 0x61c829)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:52 +0x52
net/http.(*Transport).dial(0xc42026c000, 0x1adfb00, 0xc4200c4010, 0x197861f, 0x3, 0xc4200df9a0, 0x1b, 0x5d, 0xffffffffffffffff, 0x0, ...)
	/usr/local/go/src/net/http/transport.go:901 +0x78
net/http.(*Transport).dialConn(0xc42026c000, 0x1adfb00, 0xc4200c4010, 0x0, 0xc4200aa2a0, 0x4, 0xc4200df9a0, 0x1b, 0xc42026c000, 0xc420340800, ...)
	/usr/local/go/src/net/http/transport.go:1143 +0x317
net/http.(*Transport).getConn.func4(0xc42026c000, 0x1adfb00, 0xc4200c4010, 0xc42028ea50, 0xc4200ae540)
	/usr/local/go/src/net/http/transport.go:957 +0x78
created by net/http.(*Transport).getConn
	/usr/local/go/src/net/http/transport.go:956 +0x363

goroutine 55 [select]:
net.cgoLookupIP(0x1adfac0, 0xc420257680, 0xc4200df9a0, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/cgo_unix.go:212 +0x19f
net.(*Resolver).lookupIP(0x27a0000, 0x1adfac0, 0xc420257680, 0xc4200df9a0, 0x16, 0x0, 0xc4202f5500, 0xc4201b5bc0, 0x0, 0x0)
	/usr/local/go/src/net/lookup_unix.go:95 +0x12d
net.(*Resolver).(net.lookupIP)-fm(0x1adfac0, 0xc420257680, 0xc4200df9a0, 0x16, 0x429189, 0x8, 0xc4201b5bc0, 0x0, 0xc4200676a0)
	/usr/local/go/src/net/lookup.go:192 +0x56
net.glob..func10(0x1adfac0, 0xc420257680, 0xc4203084a0, 0xc4200df9a0, 0x16, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/hook.go:19 +0x52
net.(*Resolver).LookupIPAddr.func1(0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/lookup.go:206 +0xd8
internal/singleflight.(*Group).doCall(0x279fff0, 0xc420167220, 0xc4200df9a0, 0x16, 0xc42028eb70)
	/usr/local/go/src/internal/singleflight/singleflight.go:95 +0x2e
created by internal/singleflight.(*Group).DoChan
	/usr/local/go/src/internal/singleflight/singleflight.go:88 +0x2d0

hellobinge avatar Feb 20 '19 04:02 hellobinge

Same issue here. I tried exactly the same versions you describe and the same thing happens: 0.9.1 works but the others do not. I tried 0.10 on other hosts and it works there. Only a single host has the problem.

What info do you need about the machine?

segator avatar Feb 20 '19 18:02 segator

Exactly the same problem here. I was using 0.10.0 before, but that crashed on startup due to #977. #1016 was supposed to fix this, but it didn't for some reason?

onitake avatar Feb 26 '19 10:02 onitake

I built 0.11.0 using Go 1.11, but the segfault persists. Edit: It's the same with Go 1.8.7.

go spits out the following messages:

# github.com/coreos/flannel
/usr/bin/ld: /tmp/go-link-114095991/000022.o: in function `mygetgrouplist':
/build/golang-1.11-U2p3Pq/golang-1.11-1.11.5/src/os/user/getgrouplist_unix.go:16: warning: Using 'getgrouplist' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-114095991/000021.o: in function `mygetgrgid_r':
/build/golang-1.11-U2p3Pq/golang-1.11-1.11.5/src/os/user/cgo_lookup_unix.go:38: warning: Using 'getgrgid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-114095991/000021.o: in function `mygetgrnam_r':
/build/golang-1.11-U2p3Pq/golang-1.11-1.11.5/src/os/user/cgo_lookup_unix.go:43: warning: Using 'getgrnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-114095991/000021.o: in function `mygetpwnam_r':
/build/golang-1.11-U2p3Pq/golang-1.11-1.11.5/src/os/user/cgo_lookup_unix.go:33: warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-114095991/000021.o: in function `mygetpwuid_r':
/build/golang-1.11-U2p3Pq/golang-1.11-1.11.5/src/os/user/cgo_lookup_unix.go:28: warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /tmp/go-link-114095991/000004.o: in function `_cgo_18049202ccd9_C2func_getaddrinfo':
/tmp/go-build/cgo-gcc-prolog:49: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

Perhaps incompatible glibc versions are the root of the problem?

If this is really a bug/race condition in Go, it was never fixed.

onitake avatar Feb 26 '19 10:02 onitake

Some more details on the involved systems:

Build: Debian buster (gcc 8.2.0, glibc 2.28, go 1.11.5)
Build 2: Docker build according to Makefile, tested with go 1.8.7, 1.10 and 1.11.5

Host: CentOS 7 (glibc 2.17)

onitake avatar Mar 01 '19 08:03 onitake

Same here - with slightly different stack trace (see below).

Ubuntu 18.04.2 LTS
Kernel: 4.15.0-46-generic
libc6: 2.27-3ubuntu1

Fails: 0.11.0 and 0.10.0
Works: 0.9.1

I0310 15:19:18.959145   24981 main.go:475] Determining IP address of default interface
I0310 15:19:18.959431   24981 main.go:488] Using interface with name enp1s0 and address 192.168.122.75
I0310 15:19:18.959495   24981 main.go:505] Defaulting external address to interface address (192.168.122.75)
I0310 15:19:18.959642   24981 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: None
I0310 15:19:18.959710   24981 main.go:238] Installing signal handlers
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x63 pc=0x7f2d8cc20448]

runtime stack:
runtime.throw(0x1a7f7cf, 0x2a)
	/usr/local/go/src/runtime/panic.go:605 +0x95
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:351 +0x2b8

goroutine 10 [syscall, locked to thread]:
runtime.cgocall(0x15273b0, 0xc42002bde8, 0x1a7d72e)
	/usr/local/go/src/runtime/cgocall.go:132 +0xe4 fp=0xc42002bda8 sp=0xc42002bd68 pc=0x402514
net._C2func_getaddrinfo(0x7f2d800008c0, 0x0, 0xc4203760f0, 0xc420482080, 0x0, 0x0, 0x0)
	net/_obj/_cgo_gotypes.go:86 +0x5f fp=0xc42002bde8 sp=0xc42002bda8 pc=0x52081f
net.cgoLookupIPCNAME.func2(0x7f2d800008c0, 0x0, 0xc4203760f0, 0xc420482080, 0xc42005cc60, 0xc420349920, 0x10)
	/usr/local/go/src/net/cgo_unix.go:151 +0x13f fp=0xc42002be40 sp=0xc42002bde8 pc=0x527dbf
net.cgoLookupIPCNAME(0xc420349920, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/cgo_unix.go:151 +0x175 fp=0xc42002bf38 sp=0xc42002be40 pc=0x522075
net.cgoIPLookup(0xc42005cde0, 0xc420349920, 0x10)
	/usr/local/go/src/net/cgo_unix.go:203 +0x4d fp=0xc42002bfc8 sp=0xc42002bf38 pc=0x5227bd
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42002bfd0 sp=0xc42002bfc8 pc=0x4598f1
created by net.cgoLookupIP
	/usr/local/go/src/net/cgo_unix.go:213 +0xaf

goroutine 1 [select]:
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do(0xc4203cfae0, 0x7f2d8c272a08, 0xc4201d6840, 0x2838420, 0xc42033d1a0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:531 +0x31d
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*redirectFollowingHTTPClient).Do(0xc4201bf2e0, 0x7f2d8c272a08, 0xc4201d6840, 0x2838420, 0xc42033d1a0, 0x7ffcf7291755, 0x15, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:603 +0xb2
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*httpClusterClient).Do(0xc42005c780, 0x7f2d8c272a08, 0xc4201d6840, 0x2838420, 0xc42033d1a0, 0xc420601a88, 0xc420601aa8, 0x4119c8, 0x3, 0x18b1f20, ...)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:360 +0x36c
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*httpKeysAPI).Get(0xc4201bf240, 0x7f2d8c272a08, 0xc4201d6840, 0xc4203498e0, 0x1a, 0xc4204471c9, 0xc420404401, 0x3, 0x4)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/keys.go:422 +0xe1
github.com/coreos/flannel/subnet/etcdv2.(*etcdSubnetRegistry).getNetworkConfig(0xc42033d050, 0x7f2d8c272a08, 0xc4201d6840, 0x7f2d8c272a08, 0x453cd0, 0xc420601b68, 0x1859480)
	/go/src/github.com/coreos/flannel/subnet/etcdv2/registry.go:117 +0x125
github.com/coreos/flannel/subnet/etcdv2.(*LocalManager).GetNetworkConfig(0xc4201bf280, 0x7f2d8c272a08, 0xc4201d6840, 0x1, 0x28d52e0, 0x7f2d8c272a08)
	/go/src/github.com/coreos/flannel/subnet/etcdv2/local_manager.go:88 +0x4b
main.getConfig(0x7f2d8c272a08, 0xc4201d6840, 0x284ffc0, 0xc4201bf280, 0xc42005c7e0, 0xc420404470, 0xc420447210)
	/go/src/github.com/coreos/flannel/main.go:347 +0xb2
main.main()
	/go/src/github.com/coreos/flannel/main.go:262 +0x595

goroutine 19 [syscall]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 20 [chan receive]:
github.com/coreos/flannel/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x28a94c0)
	/go/src/github.com/coreos/flannel/vendor/github.com/golang/glog/glog.go:879 +0x9f
created by github.com/coreos/flannel/vendor/github.com/golang/glog.init.0
	/go/src/github.com/coreos/flannel/vendor/github.com/golang/glog/glog.go:410 +0x203

goroutine 5 [select, locked to thread]:
runtime.gopark(0x1ad26a0, 0x0, 0x1a58035, 0x6, 0x18, 0x1)
	/usr/local/go/src/runtime/proc.go:287 +0x12c
runtime.selectgo(0xc420027f50, 0xc4200662a0)
	/usr/local/go/src/runtime/select.go:395 +0x1149
runtime.ensureSigM.func1()
	/usr/local/go/src/runtime/signal_unix.go:511 +0x220
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1

goroutine 6 [runnable]:
main.main.func1(0x7f2d8c272a08, 0xc4201d6840, 0xc42005c7e0, 0xc420404470, 0xc420447210)
	/go/src/github.com/coreos/flannel/main.go:251
created by main.main
	/go/src/github.com/coreos/flannel/main.go:251 +0x553

goroutine 7 [select]:
net/http.(*Transport).getConn(0xc420346000, 0xc42033d2f0, 0x0, 0xc4201e42a0, 0x4, 0xc420349920, 0x15, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:948 +0x5bf
net/http.(*Transport).RoundTrip(0xc420346000, 0xc42046c400, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:400 +0x6a6
github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do.func1(0xc4203cfae0, 0xc42046c400, 0xc42005c8a0)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:523 +0x41
created by github.com/coreos/flannel/vendor/github.com/coreos/etcd/client.(*simpleHTTPClient).Do
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:522 +0x200

goroutine 8 [select]:
net.(*Resolver).LookupIPAddr(0x28a7580, 0x284d920, 0xc42005cc60, 0xc420349920, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/lookup.go:196 +0x52b
net.(*Resolver).internetAddrList(0x28a7580, 0x284d920, 0xc42005cc60, 0x1a55550, 0x3, 0xc420349920, 0x15, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/net/ipsock.go:293 +0x644
net.(*Resolver).resolveAddrList(0x28a7580, 0x284d920, 0xc42005cc60, 0x1a55bae, 0x4, 0x1a55550, 0x3, 0xc420349920, 0x15, 0x0, ...)
	/usr/local/go/src/net/dial.go:193 +0x594
net.(*Dialer).DialContext(0xc42005c6c0, 0x284d8e0, 0xc420072010, 0x1a55550, 0x3, 0xc420349920, 0x15, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/net/dial.go:375 +0x248
net.(*Dialer).Dial(0xc42005c6c0, 0x1a55550, 0x3, 0xc420349920, 0x15, 0x240020066660, 0x110, 0x110, 0xc420266120)
	/usr/local/go/src/net/dial.go:320 +0x75
net.(*Dialer).Dial-fm(0x1a55550, 0x3, 0xc420349920, 0x15, 0xc420404550, 0xc42003d998, 0x404409, 0x60)
	/go/src/github.com/coreos/flannel/vendor/github.com/coreos/etcd/client/client.go:52 +0x52
net/http.(*Transport).dial(0xc420346000, 0x284d8e0, 0xc420072010, 0x1a55550, 0x3, 0xc420349920, 0x15, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/net/http/transport.go:887 +0x7b
net/http.(*Transport).dialConn(0xc420346000, 0x284d8e0, 0xc420072010, 0x0, 0xc4201e42a0, 0x4, 0xc420349920, 0x15, 0xc420346000, 0xc42046c400, ...)
	/usr/local/go/src/net/http/transport.go:1060 +0x1d62
net/http.(*Transport).getConn.func4(0xc420346000, 0x284d8e0, 0xc420072010, 0xc42033d320, 0xc420066540)
	/usr/local/go/src/net/http/transport.go:943 +0x78
created by net/http.(*Transport).getConn
	/usr/local/go/src/net/http/transport.go:942 +0x393

goroutine 9 [select]:
net.cgoLookupIP(0x284d920, 0xc42005cc60, 0xc420349920, 0x10, 0xc42033d2f0, 0x0, 0xc4201e42a0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/cgo_unix.go:214 +0x1b0
net.(*Resolver).lookupIP(0x28a7580, 0x284d920, 0xc42005cc60, 0xc420349920, 0x10, 0x0, 0x17ebf00, 0xc42033d0b0, 0x0, 0x0)
	/usr/local/go/src/net/lookup_unix.go:95 +0x12d
net.(*Resolver).(net.lookupIP)-fm(0x284d920, 0xc42005cc60, 0xc420349920, 0x10, 0x0, 0x0, 0x6a1d50, 0xc420346000, 0x0)
	/usr/local/go/src/net/lookup.go:187 +0x56
net.glob..func10(0x284d920, 0xc42005cc60, 0xc420404590, 0xc420349920, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/hook.go:19 +0x52
net.(*Resolver).LookupIPAddr.func1(0xc420346000, 0x284d8e0, 0xc420072010, 0x0)
	/usr/local/go/src/net/lookup.go:193 +0x5c
internal/singleflight.(*Group).doCall(0x28a7570, 0xc420078dc0, 0xc420349920, 0x10, 0xc42033d440)
	/usr/local/go/src/internal/singleflight/singleflight.go:93 +0x2e
created by internal/singleflight.(*Group).DoChan
	/usr/local/go/src/internal/singleflight/singleflight.go:86 +0x31f

florath avatar Mar 10 '19 15:03 florath

This is caused by an upstream Go bug, which ultimately traces back to glibc. See https://github.com/golang/go/issues/30310. There is little we can do to fix it.

The workaround is to set export GODEBUG=netdns=go when you run the Go application. This forces the Go runtime to use the pure Go DNS resolver instead of the operating system's API (e.g. getaddrinfo) to resolve names.
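For code you control, the same preference can also be expressed per resolver rather than per process. The following is a minimal sketch, not flannel's code: it assumes a Go version that has the `PreferGo` field on `net.Resolver`, and the helper name `newPureGoResolver` and the looked-up name `localhost` are purely illustrative.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newPureGoResolver (hypothetical helper) returns a resolver that asks the
// net package to prefer its built-in Go DNS client over the cgo path that
// goes through getaddrinfo.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// "localhost" is only an example name; a real program would look up
	// whatever host it needs to reach (for instance an etcd endpoint).
	addrs, err := newPureGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

A resolver built this way can be assigned to the `Resolver` field of a `net.Dialer` so that dials made through that dialer avoid the cgo lookup. Programs that use the default resolver everywhere still need the GODEBUG setting.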

If you are deploying flannel in k8s, add the following definition to the DaemonSet YAML:

env:
- name: GODEBUG
  value: netdns=go

Limitations of the pure Go resolver (you should read this before applying the workaround):

From the Go source code:

Name Resolution

The method for resolving domain names, whether indirectly with functions like Dial
or directly with functions like LookupHost and LookupAddr, varies by operating system.

On Unix systems, the resolver has two options for resolving names.
It can use a pure Go resolver that sends DNS requests directly to the servers
listed in /etc/resolv.conf, or it can use a cgo-based resolver that calls C
library routines such as getaddrinfo and getnameinfo.

By default the pure Go resolver is used, because a blocked DNS request consumes
only a goroutine, while a blocked C call consumes an operating system thread.
When cgo is available, the cgo-based resolver is used instead under a variety of
conditions: on systems that do not let programs make direct DNS requests (OS X),
when the LOCALDOMAIN environment variable is present (even if empty),
when the RES_OPTIONS or HOSTALIASES environment variable is non-empty,
when the ASR_CONFIG environment variable is non-empty (OpenBSD only),
when /etc/resolv.conf or /etc/nsswitch.conf specify the use of features that the
Go resolver does not implement, and when the name being looked up ends in .local
or is an mDNS name.

The resolver decision can be overridden by setting the netdns value of the
GODEBUG environment variable (see package runtime) to go or cgo, as in:

	export GODEBUG=netdns=go    # force pure Go resolver
	export GODEBUG=netdns=cgo   # force cgo resolver

The decision can also be forced while building the Go source tree
by setting the netgo or netcgo build tag.

A numeric netdns setting, as in GODEBUG=netdns=1, causes the resolver
to print debugging information about its decisions.
To force a particular resolver while also printing debugging information,
join the two settings by a plus sign, as in GODEBUG=netdns=go+1.

On Plan 9, the resolver always accesses /net/cs and /net/dns.

On Windows, the resolver always uses C library functions, such as GetAddrInfo and DnsQuery.

Einsfier avatar Jan 07 '21 05:01 Einsfier

I tend to ditch the native resolver in all my Go container builds. Maybe this should also be done for flannel?

Just set

export CGO_ENABLED=0

when compiling, and the resulting Go binary will be statically linked and no longer rely on the glibc resolver.

onitake avatar Jan 07 '21 12:01 onitake

> I tend to ditch the native resolver in all my Go container builds. Maybe this should also be done for flannel?
>
> Just set
>
> export CGO_ENABLED=0
>
> when compiling, and the resulting Go binary will be statically linked and no longer rely on the glibc resolver.

Obviously this does not apply to all Go applications. Many Go apps rely on cgo to achieve their designed purpose. Disabling cgo directly leads to compile failure.

By the way, flannel is already statically linked. This panic occurs in the Go runtime implementation, not in flannel.

root@linux:~# ldd ./flanneld
        not a dynamic executable

Currently there is no compile-time option to completely ditch the native resolver. You need to explicitly specify the GODEBUG env on each run.
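One way to avoid remembering the variable on each run is a small launcher that injects it before starting the real binary. This is only a sketch under stated assumptions: the helper name `withNetDNSGo` is hypothetical, and the command to launch (for example the path to flanneld) is taken from the launcher's own arguments rather than from anything in this thread.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withNetDNSGo (hypothetical helper) returns a copy of env whose GODEBUG
// entry contains netdns=go. Other GODEBUG settings are preserved, and any
// existing netdns setting is replaced.
func withNetDNSGo(env []string) []string {
	out := make([]string, 0, len(env)+1)
	var kept []string
	for _, kv := range env {
		if !strings.HasPrefix(kv, "GODEBUG=") {
			out = append(out, kv)
			continue
		}
		for _, s := range strings.Split(strings.TrimPrefix(kv, "GODEBUG="), ",") {
			if s != "" && !strings.HasPrefix(s, "netdns=") {
				kept = append(kept, s)
			}
		}
	}
	kept = append(kept, "netdns=go")
	return append(out, "GODEBUG="+strings.Join(kept, ","))
}

func main() {
	env := withNetDNSGo(os.Environ())
	if len(os.Args) < 2 {
		// Without a command, just show the GODEBUG entry that would be used.
		fmt.Println(env[len(env)-1])
		return
	}
	// Run the given command (e.g. the flanneld binary) with the adjusted
	// environment, passing stdio straight through.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = env
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

In practice a systemd unit's environment setting or the DaemonSet env entry shown earlier does the same job without an extra binary. The launcher mainly illustrates that the setting has to be present in the process environment at startup.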

Einsfier avatar Jan 07 '21 12:01 Einsfier

I tried both v0.13.0 and v0.16.1, and I'm getting this entire segfault thing too.

I0118 08:03:42.789633    8072 main.go:518] Determining IP address of default interface
I0118 08:03:42.790837    8072 main.go:531] Using interface with name eth0 and address 172.20.0.5
I0118 08:03:42.790969    8072 main.go:548] Defaulting external address to interface address (172.20.0.5)
I0118 08:03:42.791199    8072 main.go:246] Created subnet manager: Etcd Local Manager with Previous Subnet: None
I0118 08:03:42.791333    8072 main.go:249] Installing signal handlers
I0118 08:03:42.795668    8072 main.go:390] Found network config - Backend type: aws-vpc
I0118 08:03:42.795793    8072 awsvpc.go:88] Backend configured as: %s{"Type": "aws-vpc"}
I0118 08:03:42.797431    8072 local_manager.go:147] Found lease (172.20.53.0/24) for current IP (172.20.0.5), reusing
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x63 pc=0x7ff76993d448]

runtime stack:
runtime.throw(0x1a32f41, 0x2a)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:679 +0x46a
...
...

This is on a normal Ubuntu setup on AWS.

The segfault only happens while I have one of these values set. Not sure if they're wrongly formatted?

etcdctl set /coreos.com/network/config '{"Network": "172.20.48.0/20", "Backend": {"Type": "aws-vpc", "RouteTableID": ["rtb-1234censored5678"]}}'

etcdctl set /coreos.com/network/config '{"Network": "172.20.48.0/20", "Backend": {"Type": "aws-vpc"}, "RouteTableID": "rtb-1234censored5678"}'

The fix that gets around it is still:

export GODEBUG=netdns=go

AddisonG avatar Jan 18 '22 08:01 AddisonG

There is a long history around this: statically linked applications like flanneld still depend on glibc at runtime (tied to nested calls to dlopen()). I ran into this yesterday. After some research, I realized that another way around it is to have nscd (name service cache daemon) running on the system. This eliminated the crash. Hope this helps.

rajeshvijayarajan avatar Sep 20 '22 14:09 rajeshvijayarajan

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 20 '23 05:03 stale[bot]