mitogen: Prevent hung bootstrap processes, add 5 second timeout to first stage

Open moreati opened this issue 2 months ago • 1 comments

Since the first stage started using select.select() (to handle an obscure corner case where stdin appears to be non-blocking), there has been a report of first stage processes running forever in an infinite loop, reading 0 bytes from stdin.
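For illustration, here is a minimal sketch of how a select-based read loop can avoid spinning on 0-byte reads. It is not the mitogen first stage; `read_exactly` and its `timeout` parameter are hypothetical names. It relies on the fact that select() reports a pipe at EOF as readable and os.read() then returns b"", so a loop that only counts bytes received never terminates unless it checks for the empty read.

```python
import os
import select


def read_exactly(fd, n, timeout=5.0):
    """Read exactly n bytes from fd, raising instead of looping forever.

    Hypothetical helper for illustration only.
    """
    chunks = []
    remaining = n
    while remaining:
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            raise TimeoutError("no data within %.1f seconds" % timeout)
        chunk = os.read(fd, remaining)
        if not chunk:
            # EOF: the fd stays "readable" and every read returns b"".
            # Without this check the loop would spin indefinitely.
            raise EOFError("EOF with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"abc")
    os.close(w)  # writer goes away after sending only 3 of 5 bytes
    try:
        read_exactly(r, 5)
    except EOFError as exc:
        print("aborted:", exc)
```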

This attempts an end run around that problem by aborting if the bootstrap takes longer than 5 seconds, for any reason. The existing retry logic should deal with the aborted attempt as before.
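A rough sketch of the general technique, assuming a POSIX system and a timeout armed with signal.alarm(); whether the patch arms it exactly this way is an assumption. `CHILD_SOURCE` and `run_with_stalled_stdin` are hypothetical names, and the timeout is shortened to 1 second to keep the demonstration quick. The child blocks on a stdin that never delivers data or EOF, and the default SIGALRM action terminates it.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a first stage: arm a timeout, then block on stdin.
CHILD_SOURCE = """
import signal, sys
signal.alarm(1)     # default SIGALRM action terminates the process
sys.stdin.read()    # blocks: the pipe is never written to or closed in time
"""


def run_with_stalled_stdin():
    """Run the child with a stdin pipe that stays open and silent."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
    )
    returncode = proc.wait(timeout=10)
    proc.stdin.close()
    return returncode


if __name__ == "__main__":
    rc = run_with_stalled_stdin()
    # A negative return code means the child was killed by that signal.
    print("killed by SIGALRM:", rc == -signal.SIGALRM)
```

Because the process is terminated by the signal rather than by cleanup code, nothing in the blocked code path needs to cooperate for the abort to happen.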

refs #1306, #1307, #1348

moreati avatar Oct 30 '25 15:10 moreati

An experiment to test my belief about how this patch will behave. The following code is fork_it.py, a simplified version of the first stage:

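#!/usr/bin/env python3

import os
import signal
import sys

# Pipe that carries the preamble from the fork child to the fork parent.
R, W = os.pipe()

if child_pid := os.fork():
    # Fork parent: keep the original stdin on fd 100, make the pipe's read
    # end the new stdin, then replace this process with a fresh interpreter.
    os.dup2(0, 100)
    os.dup2(R, 0)
    os.close(R)
    os.close(W)
    sys.stdout.write(f'parent: {os.getpid()}, child: {child_pid}\n')
    sys.stdout.flush()  # execl() discards any unflushed userspace buffers
    os.execl(sys.executable, sys.executable)
else:
    # Fork child: SIGALRM's default action kills this process after 2 seconds.
    signal.alarm(2)
    # Blocks until EOF on the original stdin; the data is intentionally discarded.
    s = sys.stdin.read()
    with os.fdopen(W, 'wb', 0) as f:
        f.write(b'print("Hello world!")\n')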

The simplifications include

  • The fork child sends a hard coded preamble (b'print("Hello world!")\n')
  • The fork child discards whatever it reads from stdin, this is intentional.
  • stdin can be safely assumed to be blocking
  • There is no compression, or stage 0.
  • There is only a single pipe, since there's no need to cache a copy of the preamble
  • For convenience the SIGALRM timeout is only 2 seconds.

What I expect:

  • When no data is provided to stdin

    • the call to sys.stdin.read() blocks
    • eventually the fork child is killed by SIGALRM
    • when the child terminates, the Python interpreter in the parent will receive EOF on its stdin and also terminate
    • the simulated preamble will never be received, so won't execute or print anything
    • no lingering processes should be left
  • When data is provided

    • Hello world! will be printed
    • both parent and child will exit gracefully
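The parent-side half of these expectations can be sketched in isolation: an interpreter whose stdin is a pipe runs whatever arrives before EOF, and exits quietly if the write end closes without sending anything. This is a minimal sketch assuming a POSIX system; `run_interpreter_fed_by_pipe` is a hypothetical name, and closing the write end here stands in for the fork child exiting or being killed.

```python
import os
import subprocess
import sys


def run_interpreter_fed_by_pipe(preamble):
    """Run a Python interpreter whose stdin is the read end of a pipe.

    Returns (returncode, stdout bytes). Hypothetical helper for illustration.
    """
    r, w = os.pipe()
    # Popen closes inherited fds other than 0, 1, 2 by default, so the
    # interpreter does not hold its own copy of the write end.
    proc = subprocess.Popen([sys.executable], stdin=r, stdout=subprocess.PIPE)
    os.close(r)
    if preamble:
        os.write(w, preamble)
    os.close(w)  # last writer gone: the interpreter now sees EOF
    out, _ = proc.communicate(timeout=10)
    return proc.returncode, out


if __name__ == "__main__":
    # Preamble delivered before EOF: it is executed.
    print(run_interpreter_fed_by_pipe(b'print("Hello world!")\n'))
    # Writer disappears without sending anything: nothing runs, clean exit.
    print(run_interpreter_fed_by_pipe(b""))
```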

The results appear to match these expectations:

➜  mitogen git:(master) ✗ python3 fork_it.py        
parent: 11482, child: 11483
➜  mitogen git:(master) ✗ ps -p 11482,11483 
  PID TTY           TIME CMD
➜  mitogen git:(master) ✗ python3 fork_it.py <<< foo
parent: 11725, child: 11726
Hello world!

moreati avatar Nov 07 '25 21:11 moreati