Failure to unwind stack in an after-fork handler
After causing a deadlock in an after-fork handler installed with pthread_atfork, an attempt to unwind that process with pystack remote --native-all fails with:
Engine error: basic_string::_S_construct null not valid
That's happening because dwfl_getthread_frames is not finding any frames, and is also not setting dwfl_errno to something non-zero. Interestingly, this doesn't reproduce with eu-stack, so we might be doing something wrong on our side.
#101 fixes the failure mode that we get here, but we should figure out why unwinding is failing, as both gdb and eu-stack succeed.
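For context on the error text itself: "basic_string::_S_construct null not valid" is what older libstdc++ versions report when a std::string is built from a null const char*. If the unwinder's error message is fetched with dwfl_errmsg while no error has been recorded, that call returns NULL, which would explain the message. The sketch below shows one way to guard against that. The helper name message_or is hypothetical, and this is not necessarily what #101 does.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of pystack or elfutils): convert a
// possibly-null C string, such as an error message pointer returned by
// a C library, into a std::string without ever constructing from null.
std::string message_or(const char* msg, const std::string& fallback) {
    return msg != nullptr ? std::string(msg) : fallback;
}
```

A caller would wrap the raw pointer before building an exception message, for example message_or(raw_msg, "unknown unwinding error"), where raw_msg is whatever pointer the C library handed back.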
Oof, I see what we've got wrong:
$ pystack -v remote --native-all 6611 2>&1 | egrep 'tid|thread'
INFO(process_remote): Trying to stop thread 6611
INFO(process_remote): Waiting for thread 6611 to be stopped
INFO(process_remote): Fetching Python threads
INFO(process_remote): Constructing new Python thread with tid 6610
INFO(process_remote): Detaching from thread 6611
The 6610 on the second-to-last line is the parent process's pid/tid, not the child process's. Since we're still inside fork at this point, we seem to be reading a structure somewhere that still holds the old pid/tid, rather than the new one that the child got after fork. Unwinding then fails because we're asking libdw to unwind a thread that doesn't exist in this process.
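The stale-tid effect can be shown in isolation. The following is a minimal Linux-only sketch, not pystack code: it caches the kernel thread id before fork and checks in the child that the cached value no longer matches the child's real tid. The function names are made up for illustration, and the cached variable only stands in for whatever structure still holds the old value; the thread above does not say which one that is.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Kernel thread id of the calling thread (Linux-specific syscall).
static pid_t current_tid() {
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// Forks once and returns true if the tid cached before fork() differs
// from the tid the child actually has. Returns false on any failure.
bool cached_tid_is_stale_in_child() {
    // Stands in for a value stored somewhere before fork() was called.
    const pid_t cached_tid = current_tid();

    const pid_t child = fork();
    if (child < 0) {
        return false;  // fork failed
    }
    if (child == 0) {
        // Child: the cached value still names the parent's thread.
        _exit(current_tid() != cached_tid ? 0 : 1);
    }

    // Parent: collect the child's verdict from its exit status.
    int status = 0;
    if (waitpid(child, &status, 0) != child) {
        return false;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Any tid read from such a cached value in the child refers to a thread that lives in the parent, which is consistent with libdw finding no frames for it in the attached process.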