kcptun icon indicating copy to clipboard operation
kcptun copied to clipboard

golang 1.14 related problem(mipsle crash, 100% cpu)

Open wangvisual opened this issue 5 years ago • 28 comments
trafficstars

20200201 version works, but for version 20200226 && 20200321, the --help may even crashes, but not always, when it crash, the log shows: root@localhost:/opt/bin/a# ./client_linux_mipsle.0321 --help runtime: pipe failed with -89 fatal error: runtime: pipe failed

goroutine 6 [running]: runtime.throw(0x40afb2, 0x14) /usr/local/go/src/runtime/panic.go:1114 +0x60 fp=0xc285e8 sp=0xc285d4 pc=0x4cd0c runtime.netpollinit() /usr/local/go/src/runtime/netpoll_epoll.go:40 +0x20c fp=0xc28620 sp=0xc285e8 pc=0x489a0 runtime.netpollGenericInit() /usr/local/go/src/runtime/netpoll.go:112 +0xa0 fp=0xc28628 sp=0xc28620 pc=0x46cf4 runtime.doaddtimer(0xc1e000, 0xc3e444) /usr/local/go/src/runtime/time.go:271 +0x1e8 fp=0xc28654 sp=0xc28628 pc=0x756a0 runtime.addInitializedTimer(0xc3e444) /usr/local/go/src/runtime/time.go:258 +0x8c fp=0xc28670 sp=0xc28654 pc=0x75480 runtime.addtimer(0xc3e444) /usr/local/go/src/runtime/time.go:248 +0x88 fp=0xc2867c sp=0xc28670 pc=0x753c4 time.startTimer(0xc3e444) /usr/local/go/src/runtime/time.go:207 +0x3c fp=0xc28684 sp=0xc2867c pc=0x751c4 time.NewTimer(0x0, 0x0, 0x0) /usr/local/go/src/time/sleep.go:93 +0xf8 fp=0xc286a8 sp=0xc28684 pc=0xead00 github.com/xtaci/kcp-go/v5.(*TimedSched).sched(0xc48060) /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:64 +0x5c fp=0xc287e4 sp=0xc286a8 pc=0x225e50 runtime.goexit() /usr/local/go/src/runtime/asm_mipsx.s:651 +0x4 fp=0xc287e4 sp=0xc287e4 pc=0x87d64 created by github.com/xtaci/kcp-go/v5.NewTimedSched /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:56 +0x118

goroutine 1 [runnable, locked to thread]: text/template/parse.init() /usr/local/go/src/text/template/parse/lex.go:75

goroutine 7 [runnable]: github.com/xtaci/kcp-go/v5.(*TimedSched).prepend(0xc48060) /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:93 created by github.com/xtaci/kcp-go/v5.NewTimedSched /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:58 +0x154

system type : Broadcom BCM4716 chip rev 1 pkg 10 processor : 0 cpu model : MIPS 74K V4.0 BogoMIPS : 239.20 cpu MHz : 480 wait instruction : no microsecond timers : yes tlb_entries : 64 extra interrupt vector : no hardware watchpoint : yes ASEs implemented : mips16 dsp shadow register sets : 1 VCED exceptions : not available VCEI exceptions : not available

Linux localhost 2.6.22.19 #20 Fri May 5 00:20:55 CEST 2017 mips GNU/Linux

wangvisual avatar Mar 22 '20 10:03 wangvisual

是的,golang1.14得timer行为有问题

xtaci avatar Mar 22 '20 14:03 xtaci

0322在服务器(linux64)和本地(darwin)之间开了一个连接,挂了一晚上没出现0321的CPU100%情况

maxwellzh avatar Mar 22 '20 14:03 maxwellzh

还是不行: root@weiw-e4200:/opt/bin/a# ./client_linux_mipsle.0322 --help NAME: kcptun - client(with SMUX)

USAGE: client_linux_mipsle.0322 [global options] command [command options] [arguments...]

VERSION: 20200322

COMMANDS: help, h Shows a list of commands or help for one command

GLOBAL OPTIONS: --localaddr value, -l value local listen address (default: ":12948") --remoteaddr value, -r value kcp server address (default: "vps:29900") --key value pre-shared secret between client and server (default: "it's a secrect") [$KCPTUN_KEY] --crypt value aes, aes-128, aes-192, salsa20, blowfish, twofish, cast5, 3des, tea, xtea, xor, sm4, none (default: "aes") --mode value profiles: fast3, fast2, fast, normal, manual (default: "fast") --conn value runtime: pipe failed with -89 fatal error: runtime: pipe failed

goroutine 6 [running]: runtime.throw(0x40afd2, 0x14) /usr/local/go/src/runtime/panic.go:1114 +0x60 fp=0xc285d0 sp=0xc285bc pc=0x4cd0c runtime.netpollinit() /usr/local/go/src/runtime/netpoll_epoll.go:40 +0x20c fp=0xc28608 sp=0xc285d0 pc=0x489a0 runtime.netpollGenericInit() /usr/local/go/src/runtime/netpoll.go:112 +0xa0 fp=0xc28610 sp=0xc28608 pc=0x46cf4 runtime.doaddtimer(0xc1e000, 0xc3e884) /usr/local/go/src/runtime/time.go:271 +0x1e8 fp=0xc2863c sp=0xc28610 pc=0x756a0 runtime.addInitializedTimer(0xc3e884) /usr/local/go/src/runtime/time.go:258 +0x8c fp=0xc28658 sp=0xc2863c pc=0x75480 runtime.addtimer(0xc3e884) /usr/local/go/src/runtime/time.go:248 +0x88 fp=0xc28664 sp=0xc28658 pc=0x753c4 time.startTimer(0xc3e884) /usr/local/go/src/runtime/time.go:207 +0x3c fp=0xc2866c sp=0xc28664 pc=0x751c4 time.NewTimer(0x0, 0x0, 0x0) /usr/local/go/src/time/sleep.go:93 +0xf8 fp=0xc28690 sp=0xc2866c pc=0xead00 github.com/xtaci/kcp-go/v5.(*TimedSched).sched(0xc48060) /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:64 +0x5c fp=0xc287e4 sp=0xc28690 pc=0x225f28 runtime.goexit() /usr/local/go/src/runtime/asm_mipsx.s:651 +0x4 fp=0xc287e4 sp=0xc287e4 pc=0x87d64 created by github.com/xtaci/kcp-go/v5.NewTimedSched /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:56 +0x118

goroutine 1 [runnable]: syscall.Syscall(0xfa4, 0x1, 0xc6c434, 0x5, 0x5, 0x2, 0x0) /usr/local/go/src/syscall/asm_linux_mipsx.s:17 +0x10 syscall.write(0x1, 0xc6c434, 0x5, 0x8, 0x0, 0x0, 0x0) /usr/local/go/src/syscall/zsyscall_linux_mipsle.go:914 +0x60 syscall.Write(...) /usr/local/go/src/syscall/syscall_unix.go:214 internal/poll.(*FD).Write(0xc3e080, 0xc6c434, 0x5, 0x8, 0x0, 0x0, 0x0) /usr/local/go/src/internal/poll/fd_unix.go:268 +0x174 os.(*File).write(...) /usr/local/go/src/os/file_unix.go:280 os.(*File).Write(0xc0c0f8, 0xc6c434, 0x5, 0x8, 0x8, 0x0, 0x0) /usr/local/go/src/os/file.go:153 +0x7c text/tabwriter.(*Writer).write0(0xc6c420, 0xc6c434, 0x5, 0x8) /usr/local/go/src/text/tabwriter/tabwriter.go:254 +0x60 text/tabwriter.(*Writer).writeN(0xc6c420, 0xc6c434, 0x8, 0x8, 0xd) /usr/local/go/src/text/tabwriter/tabwriter.go:268 +0xac text/tabwriter.(*Writer).writePadding(0xc6c420, 0xf, 0x24, 0x600) /usr/local/go/src/text/tabwriter/tabwriter.go:293 +0x114 text/tabwriter.(*Writer).writeLines(0xc6c420, 0x0, 0x0, 0x1c, 0xcdcb00) /usr/local/go/src/text/tabwriter/tabwriter.go:324 +0x30c text/tabwriter.(*Writer).format(0xc6c420, 0x0, 0x0, 0x1c, 0x0) /usr/local/go/src/text/tabwriter/tabwriter.go:409 +0x2ec text/tabwriter.(*Writer).format(0xc6c420, 0x0, 0x0, 0x1d, 0xc53300) /usr/local/go/src/text/tabwriter/tabwriter.go:403 +0x1f0 text/tabwriter.(*Writer).flushNoDefers(0xc6c420) /usr/local/go/src/text/tabwriter/tabwriter.go:520 +0xe0 text/tabwriter.(*Writer).flush(0xc6c420, 0x0, 0x0) /usr/local/go/src/text/tabwriter/tabwriter.go:502 +0x94 text/tabwriter.(*Writer).Flush(...) /usr/local/go/src/text/tabwriter/tabwriter.go:495 github.com/urfave/cli.printHelpCustom(0x47fa10, 0xc0c0f8, 0x41a1cc, 0x41f, 0x3e4080, 0xc01340, 0x0) /home/xtaci/go/pkg/mod/github.com/urfave/[email protected]/help.go:258 +0x3ac github.com/urfave/cli.printHelp(0x47fa10, 0xc0c0f8, 0x41a1cc, 0x41f, 0x3e4080, 0xc01340) /home/xtaci/go/pkg/mod/github.com/urfave/[email protected]/help.go:262 +0x68 github.com/urfave/cli.ShowAppHelp(0xc86210, 0xc86201, 0x40) /home/xtaci/go/pkg/mod/github.com/urfave/[email protected]/help.go:143 +0x19c github.com/urfave/cli.(*App).Run(0xc01340, 0xc0a040, 0x2, 0x2, 0x0, 0x0) /home/xtaci/go/pkg/mod/github.com/urfave/[email protected]/app.go:222 +0x8e0 main.main() /home/xtaci/go/src/github.com/xtaci/kcptun/client/main.go:457 +0x11dc

goroutine 7 [runnable]: github.com/xtaci/kcp-go/v5.(*TimedSched).prepend(0xc48060) /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:100 created by github.com/xtaci/kcp-go/v5.NewTimedSched /home/xtaci/go/pkg/mod/github.com/xtaci/kcp-go/[email protected]/timedsched.go:58 +0x154

goroutine 8 [runnable]: main.sigHandler() /home/xtaci/go/src/github.com/xtaci/kcptun/client/signal.go:18 created by main.init.0 /home/xtaci/go/src/github.com/xtaci/kcptun/client/signal.go:15 +0x44

wangvisual avatar Mar 22 '20 15:03 wangvisual

@wangvisual 你的这个问题,可否查下dmesg,或者syslog有没有异常,虽然这是 golang的bug,但errno == -89,这个不知道是啥

xtaci avatar Mar 22 '20 15:03 xtaci

EDESTADDRREQ 89 Destination address required

xtaci avatar Mar 22 '20 15:03 xtaci

log里没有可疑的信息。

wangvisual avatar Mar 22 '20 17:03 wangvisual

1.14 has one additional call nonblockingPipe() within netpollinit() compare with 1.13, the pipe() definition for mipsle was not verified.

wangvisual avatar Mar 22 '20 18:03 wangvisual

golang officially confirmed that issue, but there're still some problem with timer on amd64... some time timer hangs.

xtaci avatar Mar 23 '20 04:03 xtaci

@xtaci 试了20200322的版本,服务端还是会出现CPU占用99.9的情况,只是比20200321版本运行的时间长了,从早上9到21点。

syuez avatar Mar 23 '20 13:03 syuez

https://github.com/golang/go/issues/38023 @syuez 是的,我追查到的golang的bug

xtaci avatar Mar 23 '20 14:03 xtaci

@wangvisual can you help run this build, with patch from https://github.com/golang/go/issues/37997

kcptun-linux-mipsle-20200324.tar.gz

kcptun-linux-mips-20200324.tar.gz

xtaci avatar Mar 24 '20 02:03 xtaci

@xtaci CPU 100%的问题没有了,但是发现客户端内存好像有泄露问题

从昨天21点运行到早上8点,内存占用了80%左右

未 kill掉进程 前内存 未kill.png

运行时内存占用 空载.png

kill 掉后内存 kill掉.png

用的是TCP模式,同机子另外以UDP模式运行的客户端则没有这个问题

syuez avatar Mar 25 '20 00:03 syuez

@xtaci 从昨晚到现在一切正常,没出现CPU满载的情况

maxwellzh avatar Mar 25 '20 03:03 maxwellzh

@xtaci CPU 100%的问题没有了,但是发现客户端内存好像有泄露问题

从昨天21点运行到早上8点,内存占用了80%左右

未 kill掉进程 前内存 未kill.png

运行时内存占用 空载.png

kill 掉后内存 kill掉.png

用的是TCP模式,同机子另外以UDP模式运行的客户端则没有这个问题

TCP模式没有经过特别的优化,可能会有较大的内存消耗,空了再做优化。其他没问题就好,你可以尝试启动的时候加入 GOGC=10 ./client_xxx 这样启动。

xtaci avatar Mar 25 '20 04:03 xtaci

@xtaci 从昨晚到现在一切正常,没出现CPU满载的情况

golang可能会在下一个小版本修复此问题。https://github.com/golang/go/issues/38023

xtaci avatar Mar 25 '20 04:03 xtaci

@xtaci GOGC 怎么跟 nohup 命令配合? nohup GOGC=10 /root/client_linux_amd64 -c /root/config.json >/dev/null 2>&1 & 我尝试这样的方式但是无法运行

用sh脚本套娃好了😂

syuez avatar Mar 25 '20 04:03 syuez

@wangvisual can you help run this build, with patch from golang/go#37997

kcptun-linux-mipsle-20200324.tar.gz

kcptun-linux-mips-20200324.tar.gz 可以了,非常感谢。

wangvisual avatar Mar 25 '20 04:03 wangvisual

@xtaci ~GOGC 怎么跟 nohup 命令配合? ~nohup GOGC=10 /root/client_linux_amd64 -c /root/config.json >/dev/null 2>&1 & ~我尝试这样的方式但是无法运行~~~

用sh脚本套娃好了😂

nohup env GOGC=10 ...

wangvisual avatar Mar 25 '20 05:03 wangvisual

@xtaci ~GOGC 怎么跟 nohup 命令配合? ~nohup GOGC=10 /root/client_linux_amd64 -c /root/config.json >/dev/null 2>&1 & ~我尝试这样的方式但是无法运行~~~ 用sh脚本套娃好了😂

nohup env GOGC=10 ...

感谢!

syuez avatar Mar 25 '20 05:03 syuez

v20200409的docker版本服务端在运行一段时间之后依旧有卡死的问题(看服务端的日志:网络没问题,服务端无法收到任何packet)

qx220 avatar Apr 10 '20 12:04 qx220

@qx-juilliard docker版本可能没有更新golang版本, 我更新了一下Dockerfile,可以pull latest试试

xtaci avatar Apr 10 '20 13:04 xtaci

@xtaci CPU 100%的问题没有了,但是发现客户端内存好像有泄露问题 从昨天21点运行到早上8点,内存占用了80%左右 未 kill掉进程 前内存 未kill.png 运行时内存占用 空载.png kill 掉后内存 kill掉.png 用的是TCP模式,同机子另外以UDP模式运行的客户端则没有这个问题

TCP模式没有经过特别的优化,可能会有较大的内存消耗,空了再做优化。其他没问题就好,你可以尝试启动的时候加入 GOGC=10 ./client_xxx 这样启动。

tcp模式确实存在内存泄漏问题,附图是client进程的内存消耗,准备试试GOGC

image

tete1030 avatar Apr 28 '20 03:04 tete1030

@xtaci CPU 100%的问题没有了,但是发现客户端内存好像有泄露问题 从昨天21点运行到早上8点,内存占用了80%左右 未 kill掉进程 前内存 未kill.png 运行时内存占用 空载.png kill 掉后内存 kill掉.png 用的是TCP模式,同机子另外以UDP模式运行的客户端则没有这个问题

TCP模式没有经过特别的优化,可能会有较大的内存消耗,空了再做优化。其他没问题就好,你可以尝试启动的时候加入 GOGC=10 ./client_xxx 这样启动。

tcp模式确实存在内存泄漏问题,附图是client进程的内存消耗,准备试试GOGC

image

目前只有20200226版本好一点,之后的版本基本1天内存就爆了(VPS 512M),我现在服务器和客户端都换成0226版本了😂

syuez avatar Apr 28 '20 09:04 syuez

Not for me. Mine is related to tcp simulation mode. Normal mode has no such problem.

After GOGC=1, the situation seems a little better, but still growing memory.

image

tete1030 avatar Apr 30 '20 08:04 tete1030

@tete1030 @syuez #677 Is this related to the issue mentioned in 667 ?

If use tcp mode it will cause problem,udp will not.

syuez avatar May 01 '20 02:05 syuez

CPU usage is also high on Raspberry Pi 2.

semigodking avatar Jun 21 '20 14:06 semigodking