Profiler doesn't report full call stack if vfork is being used
We noticed that the profiler fails to report the full call stack, if the vfork LIBC functions is being used. The issue appears on x86_64, but not aarch64 (I only tested these two).
I did a bit of an investigation (link) and come up with a small sample C program to trigger the issue:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
void run();
int main() {
run();
return 0;
}
void run() {
pid_t pid;
int status;
printf("Run\n");
while(true) {
usleep(1000*100);
pid = vfork();
if (pid == -1) {
perror("vfork failed");
return;
} else if (pid == 0) {
int dev_null = open("/dev/null", O_WRONLY);
dup2(dev_null, 1);
close(dev_null);
execlp("/bin/id", "id", NULL);
} else {
waitpid(pid, &status, 0);
printf("Child process finished.\n");
}
}
}
Using https://github.com/danielpacak/opentelemetry-lazybackend/ here are two example stacktraces that I observed. First the one looking good:
------------------- New Sample --------------------
Timestamp[0]: 1761136876434313775 (2025-10-22 12:41:16.434313775 +0000 UTC)
thread.name: dummy
process.executable.name: dummy
process.executable.path: /home/laborant/sample/dummy
process.pid: 9685
thread.id: 9685
---------------------------------------------------
Instrumentation: kernel, Function: queue_work_on, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: tty_insert_flip_string_and_push_buffer, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: pty_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: process_output_block, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: n_tty_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: do_tty_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: file_tty_write.constprop.0, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: tty_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: new_sync_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: vfs_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: ksys_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: __x64_sys_write, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: do_syscall_64, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: entry_SYSCALL_64_after_hwframe, File: , Line: 0, Column: 0
Instrumentation: native: Function: 0x931cd, File: libc.so.6
Instrumentation: native: Function: 0x931f3, File: libc.so.6
Instrumentation: native: Function: 0x10e53d, File: libc.so.6
Instrumentation: native: Function: 0x8eb74, File: libc.so.6
Instrumentation: native: Function: 0x8c9b5, File: libc.so.6
Instrumentation: native: Function: 0x8dae0, File: libc.so.6
Instrumentation: native: Function: 0x8dfff, File: libc.so.6
Instrumentation: native: Function: 0x82e79, File: libc.so.6
Instrumentation: native: Function: 0x12c6, File: dummy
Instrumentation: native: Function: 0x11d1, File: dummy
Instrumentation: native: Function: 0x27674, File: libc.so.6
Instrumentation: native: Function: 0x27728, File: libc.so.6
Instrumentation: native: Function: 0x10f4, File: dummy
And here one cut after vfork:
------------------- New Sample --------------------
Timestamp[0]: 1761136872834323100 (2025-10-22 12:41:12.8343231 +0000 UTC)
thread.name: dummy
process.executable.name: dummy
process.executable.path: /home/laborant/sample/dummy
process.pid: 9685
thread.id: 9685
---------------------------------------------------
Instrumentation: kernel, Function: selinux_cred_prepare, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: prepare_creds, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: copy_creds, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: copy_process, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: kernel_clone, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: __do_sys_vfork, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: do_syscall_64, File: , Line: 0, Column: 0
Instrumentation: kernel, Function: entry_SYSCALL_64_after_hwframe, File: , Line: 0, Column: 0
Instrumentation: native: Function: 0x103d3b, File: libc.so.6
From what I can tell, the root cause is this line in GLIBC code, that pops the PC into RDI.
ENTRY (__vfork)
/* Pop the return PC value into RDI. We need a register that
is preserved by the syscall and that we're allowed to destroy. */
popq %rdi /* <---- here */
cfi_adjust_cfa_offset(-8)
cfi_register(%rip, %rdi)
/* Stuff the syscall number in RAX and enter into the kernel. */
movl $SYS_ify (vfork), %eax
syscall
/* Push back the return PC. */
pushq %rdi
cfi_adjust_cfa_offset(8)
There are CFI instructions on how to resolve this, but from what I can tell, the profiler doesn't support using the RDI register for resolving the return address (source).
As for MUSL, I still need to investigate if they include the CFI instructions or not. The source code itself (link) doesn't include the CFI macros, but from what I've read, they have a bunch of AWK scripts automatically producing them.