tools/funclatency: running funclatency against a Go program causes SIGILL

Open choleraehyq opened this issue 4 years ago • 5 comments

Example go program:

package main

import (
        "math/rand"
        "time"
)

func randomsleep() {
        time.Sleep(time.Duration(rand.Int63n(1000)) * time.Millisecond)
}

func main() {
        for {
                randomsleep()
        }
}
  1. Build and run this Go program, e.g. go build testbcc.go && ./testbcc
  2. Run funclatency against it, e.g. sudo ./funclatency -p $pid -i 1 '/data01/home/cholerae/testbcc:main.randomsleep'
  3. funclatency prints a few histograms, then the Go program crashes:
SIGILL: illegal instruction
PC=0x7fffffffe001 m=2 sigcode=0
instruction bytes: 0x75 0x73 0x72 0x2f 0x62 0x69 0x6e 0x2f 0x7a 0x73 0x68 0x0 0x0 0x0 0x0 0x0

goroutine 1 [running]:
runtime: unknown pc 0x7fffffffe001
stack: frame={sp:0xc0000ca778, fp:0x0} stack=[0xc0000ca000,0xc0000ca800)
000000c0000ca678:  0000000000000030  0000000000000048
000000c0000ca688:  000000c0000f4050  00007ffb0c28ed20
000000c0000ca698:  0000000000000000  00000000004dd6a0
000000c0000ca6a8:  00007ffb0c283108  0000000000000000
000000c0000ca6b8:  000000c0000c2040  0000000000000037
000000c0000ca6c8:  000000c0000ca718  000000000046311a <time.init+954>
000000c0000ca6d8:  000000c0000ca708  000000000040bb38 <runtime.newobject+56>
000000c0000ca6e8:  0000000000000050  0000000000430b65 <runtime.gopark+229>
000000c0000ca6f8:  00000000004819d0  000000c000010030
000000c0000ca708:  000000c0000ca748  00000000004589bf <time.Sleep+191>
000000c0000ca718:  0000000000481a10  000000c0000f4050
000000c0000ca728:  0000000000451313 <runtime.resolveTypeOff+275>  0000000000000001
000000c0000ca738:  000000c0000f4050  00000000000002ce
000000c0000ca748:  000000c0000ca768  000000000046321f <main.randomsleep+63>
000000c0000ca758:  000000002acbcf80  00000000000002ce
000000c0000ca768:  000000c0000ca778  00007fffffffe000
000000c0000ca778: <000000c0000ca7d0  0000000000430769 <runtime.main+521>
000000c0000ca788:  000000c0000f6000  0000000000000000
000000c0000ca798:  000000c0000f6000  0000000000000000
000000c0000ca7a8:  0100000000000000  0000000000000000
000000c0000ca7b8:  000000c000000180  000000c0000ca7ae
000000c0000ca7c8:  0000000000481958  0000000000000000
000000c0000ca7d8:  000000000045b6e1 <runtime.goexit+1>  0000000000000000
000000c0000ca7e8:  0000000000000000  0000000000000000
000000c0000ca7f8:  0000000000000000
runtime: unknown pc 0x7fffffffe001
stack: frame={sp:0xc0000ca778, fp:0x0} stack=[0xc0000ca000,0xc0000ca800)
000000c0000ca678:  0000000000000030  0000000000000048
000000c0000ca688:  000000c0000f4050  00007ffb0c28ed20
000000c0000ca698:  0000000000000000  00000000004dd6a0
000000c0000ca6a8:  00007ffb0c283108  0000000000000000
000000c0000ca6b8:  000000c0000c2040  0000000000000037
000000c0000ca6c8:  000000c0000ca718  000000000046311a <time.init+954>
000000c0000ca6d8:  000000c0000ca708  000000000040bb38 <runtime.newobject+56>
000000c0000ca6e8:  0000000000000050  0000000000430b65 <runtime.gopark+229>
000000c0000ca6f8:  00000000004819d0  000000c000010030
000000c0000ca708:  000000c0000ca748  00000000004589bf <time.Sleep+191>
000000c0000ca718:  0000000000481a10  000000c0000f4050
000000c0000ca728:  0000000000451313 <runtime.resolveTypeOff+275>  0000000000000001
000000c0000ca738:  000000c0000f4050  00000000000002ce
000000c0000ca748:  000000c0000ca768  000000000046321f <main.randomsleep+63>
000000c0000ca758:  000000002acbcf80  00000000000002ce
000000c0000ca768:  000000c0000ca778  00007fffffffe000
000000c0000ca778: <000000c0000ca7d0  0000000000430769 <runtime.main+521>
000000c0000ca788:  000000c0000f6000  0000000000000000
000000c0000ca798:  000000c0000f6000  0000000000000000
000000c0000ca7a8:  0100000000000000  0000000000000000
000000c0000ca7b8:  000000c000000180  000000c0000ca7ae
000000c0000ca7c8:  0000000000481958  0000000000000000
000000c0000ca7d8:  000000000045b6e1 <runtime.goexit+1>  0000000000000000
000000c0000ca7e8:  0000000000000000  0000000000000000
000000c0000ca7f8:  0000000000000000

rax    0x0
rbx    0x430b65
rcx    0xc0000ca000
rdx    0x0
rdi    0x0
rsi    0x0
rbp    0xc0000ca778
rsp    0xc0000ca778
r8     0x1
r9     0x0
r10    0x0
r11    0x206
r12    0x2
r13    0xc000000900
r14    0x80c000180000
r15    0x80c0001fffff
rip    0x7fffffffe001
rflags 0x212
cs     0x33
fs     0x0
gs     0x0

choleraehyq • Jul 30 '20 06:07

I don't know why funclatency causes SIGILL. Does anyone have any idea?

choleraehyq • Jul 30 '20 06:07

I forgot the details, but see https://github.com/iovisor/bcc/issues/1320 or https://github.com/golang/go/issues/22008. funclatency uses a return probe (a uretprobe for user-space functions), which modifies the return address on the stack. The Go runtime may modify the stack in a way that is not compatible with what the return probe is doing. This is probably not resolved yet.

yonghong-song • Jul 30 '20 16:07

The title and description are misleading. Uprobes do in fact work with Go programs. What does not work are uretprobes.

A uprobe is associated with a userspace program binary and an offset into it. When a probe is added, the kernel saves the original instruction at that file offset and replaces it with a trap instruction. When the probe is hit, the attached eBPF program is executed in kernel space. This works fine with normal uprobes.

A uretprobe patches the function entry in a similar way, but it also replaces the return address on the stack with the address of a trampoline. When the function returns into the trampoline, the eBPF program is executed and the instruction pointer is set back to the original return address. If the stack changes in the meantime, this will likely cause corruption and crashes.

uretprobes should not be used with Go programs.
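
The return-address swap described above can be sketched with a toy model in Go. This is a simplified illustration, not the kernel's or the Go runtime's implementation. It assumes that the runtime must be able to resolve every return address it finds when it walks a goroutine stack. The known addresses and the trampoline value are taken from the crash dump in this issue; hypotheticalMainRet is an invented placeholder for the return address into main.main, which the dump no longer contains because the trampoline address sits in its slot.

```go
package main

import "fmt"

// trampoline is the value found in the crash dump where a return
// address into main.main would normally be.
const trampoline uint64 = 0x7fffffffe000

// hypotheticalMainRet is an invented address standing in for the
// original return address into main.main (not present in the dump).
const hypotheticalMainRet uint64 = 0x463260

// funcTable is a toy stand-in for the runtime's table of known code addresses.
var funcTable = map[uint64]string{
	0x4589bf:            "time.Sleep+191",
	0x46321f:            "main.randomsleep+63",
	hypotheticalMainRet: "main.main (hypothetical address)",
	0x430769:            "runtime.main+521",
}

// unwind resolves each return address in order and stops at the first
// one that the function table does not know.
func unwind(returnAddrs []uint64) ([]string, error) {
	var frames []string
	for _, pc := range returnAddrs {
		name, ok := funcTable[pc]
		if !ok {
			return frames, fmt.Errorf("unknown pc 0x%x", pc)
		}
		frames = append(frames, name)
	}
	return frames, nil
}

// hijack models the entry side of a return probe: it overwrites the
// return-address slot with the trampoline and hands back the original.
func hijack(slot *uint64) uint64 {
	saved := *slot
	*slot = trampoline
	return saved
}

func main() {
	stack := []uint64{0x4589bf, 0x46321f, hypotheticalMainRet, 0x430769}

	frames, err := unwind(stack)
	fmt.Println("before probe:", frames, err)

	saved := hijack(&stack[2]) // probe fires on entry to main.randomsleep
	_, err = unwind(stack)
	fmt.Println("while probed:", err)

	stack[2] = saved // what the trampoline is supposed to undo on return
	_, err = unwind(stack)
	fmt.Println("after return:", err)
}
```

In this model the stack is only consistent before the probe fires and after the trampoline has restored the slot. Anything that inspects or relocates the stack in between sees an address it cannot account for, which is in line with the "runtime: unknown pc" lines in the report.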

chensunny • May 31 '21 06:05

Hi, I believe the same problem occurs when indiscriminately tracing all Deoptimization-related symbols in libjvm.so: even running HelloWorld causes the JVM to crash. Example for an OpenJDK build of (using configure arguments '--with-native-debug-symbols=internal'). This is likely to be a general issue when tracing managed runtimes that manipulate the stack.

funclatency path-to-libjvm.so:*Deoptimization* java HelloWorld

Not tracing functions which "mess with the stack" in ways that break uretprobes stops the crash. I guess this is motivation for potentially having a blacklist of probes that you don't want to attach.

drandynisbet • Jul 16 '21 13:07

I found the same problem in a C++ program using boost::coroutines.

ccoderr • Jun 10 '22 11:06