Rework detection of inherited PyInstaller environment

Open rokm opened this issue 1 year ago • 0 comments

This PR reworks the logic we have in place for handling subprocesses spawned from the frozen application that involve a PyInstaller-frozen executable - either the same one (e.g., multiprocessing worker spawned via sys.executable), or a different one. As described below, it also changes the assumed default, leading to a breaking change.

Motivation

Up until now, the logic was based on the presence or absence of the _MEIPASS2 environment variable. If the variable is not defined, we are the parent process of a onefile application (or a onedir application, although there, this does not matter so much). If it is defined (i.e., set by the parent onefile process once it unpacks the application), we are a child process of a onefile application.

But because the user's Python code might spawn a subprocess using a different onefile frozen application, we must ensure that this other application's process is not tricked into thinking that it is a child process (as it must unpack its own files into its own temporary directory, then run its child process). This was solved by having the bootloader unset the _MEIPASS2 environment variable after reading the path from it.

This means that the default assumption of the bootloader is that it is running the top-level process of a new instance of a frozen application, unless the presence of the _MEIPASS2 environment variable suggests otherwise.

This has important implications for pretty much every piece of code that spawns a worker subprocess using sys.executable. If the _MEIPASS2 environment variable is not restored prior to spawning the subprocess, that subprocess will become a new instance of the frozen application (i.e., a onefile executable will unpack itself again into a separate temporary directory, then run its child process).

Therefore, our run-time hook for multiprocessing overrides Popen classes and adds a mix-in that temporarily restores _MEIPASS2 by setting and unsetting the environment variable (and because this affects the state of the current process, it requires additional locking to ensure thread safety...).
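To illustrate why this pattern is awkward, here is a minimal sketch of the set-spawn-unset approach with locking. It is not the actual run-time hook code; the helper name restored_meipass2 and the module-level lock are made up for illustration.

```python
import os
import threading
from contextlib import contextmanager

# Hypothetical lock; os.environ is process-wide state, so concurrent
# spawns from different threads must not interleave set/unset.
_env_lock = threading.Lock()


@contextmanager
def restored_meipass2(home_dir):
    """Temporarily set _MEIPASS2 while a child process is being spawned."""
    with _env_lock:
        os.environ["_MEIPASS2"] = home_dir
        try:
            yield
        finally:
            # Unset again so that unrelated frozen applications spawned
            # later do not mistake themselves for our child process.
            os.environ.pop("_MEIPASS2", None)
```

A caller would wrap each spawn, for example `with restored_meipass2(sys._MEIPASS): subprocess.Popen([sys.executable, ...])`, which is exactly the burden that every piece of spawning code had to carry.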

This approach does not scale well; it requires every piece of user or third-party code that spawns a worker subprocess via sys.executable to restore _MEIPASS2 before doing so.

Sidenote: as it turns out, even the _MEIPASS2 restoration in our multiprocessing run-time hook is not quite complete. It works for the worker processes, but at least on Linux with the spawn start method, the variable does not seem to be restored for the resource tracker process, which means that the onefile application unpacks itself once more and shows the splash screen again. I suspect this is what the OP in #7121 was describing.

The change

The main idea here is to reverse the default assumption mentioned before; we now assume that we are running a worker subprocess (spawned via sys.executable), unless the lack of a PyInstaller-set environment or the values in it suggest otherwise.

Namely, we now track the path to the application's PKG/CArchive (in most cases, the path to the executable) in an environment variable. If the environment variable with this path is not set, we are the top-level process. If it is set and matches the archive path in the current process, we are a worker sub-process. And if it is set but does not match the archive path in the current process, we are the top-level process of a different frozen application.

To prevent theoretical issues of spawning a process with older PyInstaller-made executable, the environment variables have been renamed. The archive path is tracked in _PYI_ARCHIVE_FILE, while the temporary directory path is passed via _PYI_APPLICATION_HOME_DIR (we cannot use _MEIPASS2 anymore because old frozen applications assume this is unset by default, and we are effectively changing this behavior).
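The three-way decision described above can be sketched as follows. The real check lives in the bootloader, which is not written in Python; this is only an illustration of the logic. The names ProcessKind and classify_process are hypothetical, and the plain string comparison of paths is an assumption (the bootloader may compare paths differently).

```python
from enum import Enum


class ProcessKind(Enum):
    TOP_LEVEL = "top-level process"
    WORKER = "worker sub-process"
    OTHER_APP_TOP_LEVEL = "top-level process of a different frozen application"


def classify_process(archive_path, environ):
    """Classify the current process from the inherited environment.

    archive_path: path to this process's own PKG/CArchive.
    environ: mapping of environment variables (e.g., os.environ).
    """
    inherited = environ.get("_PYI_ARCHIVE_FILE")
    if inherited is None:
        # No PyInstaller environment inherited.
        return ProcessKind.TOP_LEVEL
    if inherited == archive_path:
        # Spawned from the same frozen application, e.g. via sys.executable.
        return ProcessKind.WORKER
    # Environment was set by some other frozen application.
    return ProcessKind.OTHER_APP_TOP_LEVEL
```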

This approach allows us to seamlessly handle both worker sub-processes spawned via sys.executable and the spawning of a different frozen application.

On the flip side, it breaks the restart of application as discussed in #6163; i.e., by spawning a process using sys.executable, and then exiting the current process. Formally, I think the best description for this is spawning sub-processes using the same executable and expecting them to outlive the current process (or perhaps less charitably, spawning orphaned children).

In onefile mode, such a child process will now try to reuse the same temporary files that the main application process is using, but once the main application process exits, its parent process will initiate the cleanup. Hence such an orphaned process will end up with missing files (and/or might interfere with the cleanup).

So in this scenario, the application is now required to signal to the bootloader that it should reset the PyInstaller environment, by setting the PYINSTALLER_RESET_ENVIRONMENT environment variable (to 1); alternatively, it could also unset _PYI_ARCHIVE_FILE, but I think we should give users an "official" / "public" environment variable.
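For the restart scenario, a minimal sketch of what application code might look like under this change is shown below. The helper names build_restart_env and restart_application are hypothetical; only the PYINSTALLER_RESET_ENVIRONMENT variable comes from the description above.

```python
import os
import subprocess
import sys


def build_restart_env(environ=None):
    """Return a copy of the environment that asks the bootloader to reset."""
    env = dict(os.environ if environ is None else environ)
    env["PYINSTALLER_RESET_ENVIRONMENT"] = "1"
    return env


def restart_application(args=None):
    """Spawn a fresh instance of the frozen application.

    The caller is expected to exit the current process afterwards.
    """
    cmd = [sys.executable] + list(sys.argv[1:] if args is None else args)
    return subprocess.Popen(cmd, env=build_restart_env())
```

Passing a modified copy via `env=` avoids mutating os.environ in the current process, so other sub-processes spawned before exit are not affected.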

Changes to splash screen

As part of the above changes, we now also have explicit tracking of the (sub-)process level, which allows us to distinguish between the main application process of a onefile application and a worker sub-process spawned from it via sys.executable (which up until now were both considered just "children of the parent/launcher process").

This, in turn, allows us to implement automatic suppression of the splash screen in worker sub-processes spawned from the main application process. This now applies to both onedir and onefile, and is done automatically. (Up until now, onefile did not show the splash screen in such processes if _MEIPASS2 was restored, while onedir might still have shown it.)

The pyi_splash module should now gracefully handle such suppression; its functions become no-ops, and do so quietly - they should neither raise errors nor print warnings (e.g., about not being able to connect to the splash screen).
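As an illustration, application code that uses the splash screen might look like the sketch below. The wrapper names report_progress and finish_startup are hypothetical. The ImportError guard is an assumption that covers running the same code unfrozen, or in a build without a splash screen, where pyi_splash is not importable.

```python
try:
    # Assumed to be importable only inside a frozen application
    # that was built with a splash screen.
    import pyi_splash
except ImportError:
    pyi_splash = None


def report_progress(message):
    """Update the splash text if the module is available.

    Returns True if an update was attempted, False otherwise.
    """
    if pyi_splash is None:
        return False
    # With suppression active, this is expected to be a quiet no-op.
    pyi_splash.update_text(message)
    return True


def finish_startup():
    """Close the splash screen once the application is ready."""
    if pyi_splash is not None:
        pyi_splash.close()
```

With the behavior described above, the same calls should be safe in worker sub-processes where the splash screen has been suppressed, without extra checks in user code.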

It is now also possible for the user to explicitly suppress the splash screen in the main part of the application, by setting the PYINSTALLER_SUPPRESS_SPLASH_SCREEN environment variable (to 1).

Closes #7089 - the splash screen can now be suppressed by the user via an environment variable.

Closes #7121 - both the multiprocessing and the gooey parts of that issue should be fixed now.

rokm, Jun 29 '24 14:06