otel-profiling-agent icon indicating copy to clipboard operation
otel-profiling-agent copied to clipboard

Profiler doesn't report full call stack if vfork is being used

Open patrickpichler opened this issue 2 months ago • 0 comments

We noticed that the profiler fails to report the full call stack, if the vfork LIBC functions is being used. The issue appears on x86_64, but not aarch64 (I only tested these two).

I did a bit of an investigation (link) and come up with a small sample C program to trigger the issue:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>

void run();

int main() {
    run();
    return 0;
}

void run() {
    pid_t pid;
    int status;

    printf("Run\n");

    while(true) {
      usleep(1000*100);
      pid = vfork();

      if (pid == -1) {
          perror("vfork failed");
          return;
      } else if (pid == 0) {
          int dev_null = open("/dev/null", O_WRONLY);
          dup2(dev_null, 1);
          close(dev_null);

          execlp("/bin/id", "id", NULL);
      } else {
          waitpid(pid, &status, 0);
          printf("Child process finished.\n");
      }
    }
}

Using https://github.com/danielpacak/opentelemetry-lazybackend/ here are two example stacktraces that I observed. First the one looking good:

------------------- New Sample --------------------        
  Timestamp[0]: 1761136876434313775 (2025-10-22 12:41:16.434313775 +0000 UTC)                                                                  
  thread.name: dummy                                                                                                                           
  process.executable.name: dummy                   
  process.executable.path: /home/laborant/sample/dummy                                                                                         
  process.pid: 9685                                                                                                                            
  thread.id: 9685                                                                                                                              
---------------------------------------------------                                                                                            
Instrumentation: kernel, Function: queue_work_on, File: , Line: 0, Column: 0                                                                   
Instrumentation: kernel, Function: tty_insert_flip_string_and_push_buffer, File: , Line: 0, Column: 0                                          
Instrumentation: kernel, Function: pty_write, File: , Line: 0, Column: 0                                                                       
Instrumentation: kernel, Function: process_output_block, File: , Line: 0, Column: 0                                                            
Instrumentation: kernel, Function: n_tty_write, File: , Line: 0, Column: 0                                                                     
Instrumentation: kernel, Function: do_tty_write, File: , Line: 0, Column: 0                                                                    
Instrumentation: kernel, Function: file_tty_write.constprop.0, File: , Line: 0, Column: 0                                                      
Instrumentation: kernel, Function: tty_write, File: , Line: 0, Column: 0                                                                       
Instrumentation: kernel, Function: new_sync_write, File: , Line: 0, Column: 0                                                                  
Instrumentation: kernel, Function: vfs_write, File: , Line: 0, Column: 0                                                                       
Instrumentation: kernel, Function: ksys_write, File: , Line: 0, Column: 0                                                                      
Instrumentation: kernel, Function: __x64_sys_write, File: , Line: 0, Column: 0                                                                 
Instrumentation: kernel, Function: do_syscall_64, File: , Line: 0, Column: 0                                                                   
Instrumentation: kernel, Function: entry_SYSCALL_64_after_hwframe, File: , Line: 0, Column: 0                                                  
Instrumentation: native: Function: 0x931cd, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x931f3, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x10e53d, File: libc.so.6                                                                                   
Instrumentation: native: Function: 0x8eb74, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x8c9b5, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x8dae0, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x8dfff, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x82e79, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x12c6, File: dummy                                                                                         
Instrumentation: native: Function: 0x11d1, File: dummy                                                                                         
Instrumentation: native: Function: 0x27674, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x27728, File: libc.so.6                                                                                    
Instrumentation: native: Function: 0x10f4, File: dummy

And here one cut after vfork:

------------------- New Sample --------------------                                                                                            
  Timestamp[0]: 1761136872834323100 (2025-10-22 12:41:12.8343231 +0000 UTC)                                                                    
  thread.name: dummy                                                                                                                           
  process.executable.name: dummy                   
  process.executable.path: /home/laborant/sample/dummy                                                                                         
  process.pid: 9685                                                                                                                            
  thread.id: 9685                                                                                                                              
---------------------------------------------------                                                                                            
Instrumentation: kernel, Function: selinux_cred_prepare, File: , Line: 0, Column: 0                                                            
Instrumentation: kernel, Function: prepare_creds, File: , Line: 0, Column: 0                                      
Instrumentation: kernel, Function: copy_creds, File: , Line: 0, Column: 0                                            
Instrumentation: kernel, Function: copy_process, File: , Line: 0, Column: 0      
Instrumentation: kernel, Function: kernel_clone, File: , Line: 0, Column: 0                  
Instrumentation: kernel, Function: __do_sys_vfork, File: , Line: 0, Column: 0                                        
Instrumentation: kernel, Function: do_syscall_64, File: , Line: 0, Column: 0                                        
Instrumentation: kernel, Function: entry_SYSCALL_64_after_hwframe, File: , Line: 0, Column: 0                        
Instrumentation: native: Function: 0x103d3b, File: libc.so.6

From what I can tell, the root cause is this line in GLIBC code, that pops the PC into RDI.

ENTRY (__vfork)

	/* Pop the return PC value into RDI.  We need a register that
	   is preserved by the syscall and that we're allowed to destroy. */
	popq	%rdi /* <---- here */
	cfi_adjust_cfa_offset(-8)
	cfi_register(%rip, %rdi)

	/* Stuff the syscall number in RAX and enter into the kernel.  */
	movl	$SYS_ify (vfork), %eax
	syscall

	/* Push back the return PC.  */
	pushq	%rdi
	cfi_adjust_cfa_offset(8)

There are CFI instructions on how to resolve this, but from what I can tell, the profiler doesn't support using the RDI register for resolving the return address (source).

As for MUSL, I still need to investigate if they include the CFI instructions or not. The source code itself (link) doesn't include the CFI macros, but from what I've read, they have a bunch of AWK scripts automatically producing them.

patrickpichler avatar Oct 22 '25 13:10 patrickpichler