Perf buffer submission failing with -EOPNOTSUPP after system suspend
I'm getting a strange issue with submitting events in one of my perf buffers. Edit: I have a minimal example, posting now as a comment.
Relevant system details:
- kernel 5.3.5-arch1-1-ARCH
- bcc version "bcc-git v0.10.0.r98.ba64f031-1" from the AUR
- llvm version 9.0.0-3
For example, with the following code:
    TRACEPOINT_PROBE(raw_syscalls, sys_enter)
    {
        ....
        int ret = events.perf_submit(ctx, &event, sizeof(event));
        bpf_trace_printk("%d\n", ret);
        return 0;
    }
Everything is fine until I run systemctl suspend, wait a few minutes, then wake my computer up again. After that, the call to perf_submit fails with -95, which corresponds to -EOPNOTSUPP.
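For reading the raw numbers that show up in the trace pipe, a small helper can turn a negative return value into an errno name. This is only a sketch: describe_bpf_return is a hypothetical name, not part of bcc, and it assumes the value follows the usual kernel convention of returning a negated errno.

```python
import errno
import os


def describe_bpf_return(ret):
    """Describe an errno-style return value (hypothetical helper, not a bcc API)."""
    if ret >= 0:
        return "success"
    code = -ret
    # Compare against EOPNOTSUPP explicitly: on Linux it shares its number
    # with ENOTSUP, so a plain errno.errorcode lookup may print either name.
    if code == errno.EOPNOTSUPP:
        name = "EOPNOTSUPP"
    else:
        name = errno.errorcode.get(code, "unknown")
    return f"-{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # Value reported by perf_submit after resume in this report
    print(describe_bpf_return(-95))
```

The numeric value of EOPNOTSUPP is platform-specific, so the helper compares against the errno module's constant rather than hard-coding 95.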
From looking at the relevant kernel code, the following check is responsible:
    ....
    if (unlikely(event->oncpu != smp_processor_id()))
        return -EOPNOTSUPP;
    ....
Does anyone have any idea what's going on? Am I missing some strange bug in my code? Is this just something we have to live with?
Here is a minimal example:
    #!/usr/bin/env python3
    import os, sys
    import time
    from bcc import BPF

    prog = """
    #define SYS_EXECVE 59 /* Trace execve calls as an example */

    struct event
    {
        u32 pid;
    };

    BPF_PERF_OUTPUT(events);

    TRACEPOINT_PROBE(raw_syscalls, sys_enter)
    {
        long syscall = args->id;
        u32 pid = bpf_get_current_pid_tgid() >> 32;
        struct event event = {.pid = pid};

        if (syscall != SYS_EXECVE)
            return 0;

        int ret = events.perf_submit((struct pt_regs *)args, &event, sizeof(event));
        if (ret)
            bpf_trace_printk("%d\\n", ret);
        return 0;
    }
    """

    bpf = BPF(text=prog)

    def on_event(cpu, data, size):
        event = bpf['events'].event(data)
        print(event.pid)

    bpf['events'].open_perf_buffer(on_event)

    # This should output a pid every time sys_execve is invoked...
    # It actually starts failing quite frequently after suspending the system and waking it back up
    # Check output of `cat /sys/kernel/debug/tracing/*pipe`
    while True:
        try:
            bpf.perf_buffer_poll()
            time.sleep(1)
        except KeyboardInterrupt:
            sys.exit()
Same problem here, kernel 5.18, Debian clang version 11.0.1-2.
Using it from Go (gobpf): https://github.com/evilsocket/opensnitch/blob/master/ebpf_prog/opensnitch-procs.c#L92
@willfindlay did you manage to fix it?
@gustavo-iniguez-goya Nope, never managed to fix it. Seems to be a kernel bug. You can use the new ringbuf map as a workaround, though, if your kernel supports it.
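For anyone wanting to try the ringbuf route, here is a rough sketch of the minimal example above ported from BPF_PERF_OUTPUT to bcc's ring buffer API. It assumes a kernel with BPF ring buffer support (Linux 5.8 or later) and a bcc build that provides BPF_RINGBUF_OUTPUT. The 8-page buffer size is an arbitrary choice for illustration, and whether this actually avoids the failure after suspend has not been verified here.

```python
#!/usr/bin/env python3
import sys

# Same event layout as the perf buffer example; only the output map changes.
prog = r"""
#define SYS_EXECVE 59 /* Trace execve calls as an example */

struct event
{
    u32 pid;
};

/* One ring buffer shared by all CPUs; 8 pages is an arbitrary size */
BPF_RINGBUF_OUTPUT(events, 8);

TRACEPOINT_PROBE(raw_syscalls, sys_enter)
{
    if (args->id != SYS_EXECVE)
        return 0;

    struct event event = {.pid = bpf_get_current_pid_tgid() >> 32};

    /* ringbuf_output copies the event into the buffer; no ctx argument needed */
    int ret = events.ringbuf_output(&event, sizeof(event), 0);
    if (ret)
        bpf_trace_printk("%d\n", ret);
    return 0;
}
"""


def main():
    # Imported here so the module can be loaded on machines without bcc
    from bcc import BPF

    bpf = BPF(text=prog)

    def on_event(ctx, data, size):
        event = bpf["events"].event(data)
        print(event.pid)

    bpf["events"].open_ring_buffer(on_event)

    while True:
        try:
            bpf.ring_buffer_poll()
        except KeyboardInterrupt:
            sys.exit()


if __name__ == "__main__":
    main()
```

Unlike a perf buffer, the ring buffer is not backed by per-CPU perf events, so the `event->oncpu != smp_processor_id()` check quoted earlier in the thread is not on this code path. The script needs root privileges to load the program.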