
Issues when combined with multiprocessing (MaybeEncodingError, BadZipFile, OSError)

Open 2xB opened this issue 6 months ago • 5 comments

When importlib_metadata is used together with multiprocessing (e.g. when pickling awkward arrays to transmit them from a subprocess to the host), I see issues. This was originally observed by @richeldichel as multiprocessing.pool.MaybeEncodingError: Error sending result: followed by an appended BadZipFile error or OSError. I could reduce this to the following minimal working example, which reproduces the underlying error intermittently (between every second and every tenth try):

import importlib_metadata

dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing

def process(i):
    importlib_metadata.entry_points()
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass

(Edit: this is a further simplified version:)

import importlib_metadata

dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing

def process(i):
    dists = importlib_metadata.MetadataPathFinder().find_distributions()
    [dist._normalized_name for dist in dists]
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass

Tested with Python 3.10.12 with importlib-metadata==8.7.0 and Python 3.8.10 with importlib-metadata==8.5.0. Note that in the latter test environment, the issue only occurs for me after running export PYTHONPATH=/home/.../.local/lib/python3.8/site-packages/ beforehand.

Below is an example error log:

Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/delthis/test6.py", line 9, in process
    importlib_metadata.entry_points()
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1094, in entry_points
    return EntryPoints(eps).select(**params)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1091, in <genexpr>
    eps = itertools.chain.from_iterable(
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/_itertools.py", line 17, in unique_everseen
    k = key(element)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/compat/py39.py", line 23, in normalized_name
    return dist._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1016, in _normalized_name
    or super()._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 552, in _normalized_name
    return Prepared.normalize(self.name)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 547, in name
    return md_none(self.metadata)['Name']
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 528, in metadata
    or self.read_text('PKG-INFO')
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 998, in read_text
    return self._path.joinpath(filename).read_text(encoding='utf-8')
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 382, in read_text
    with self.open('r', encoding, *args, **kwargs) as strm:
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 348, in open
    stream = self.root.open(self.at, zip_mode, pwd=pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1546, in open
    raise BadZipFile("Bad magic number for file header")
zipfile.BadZipFile: Bad magic number for file header

I also saw zipfile.BadZipFile: Overlapped entries: 'EGG-INFO/PKG-INFO' (possible zip bomb) and

  File "/home/.../.local/lib/python3.8/site-packages/zipp/__init__.py", line 385, in read_text
    return strm.read()
  File "/usr/lib/python3.8/zipfile.py", line 928, in read
    buf += self._read1(self.MAX_N)
  File "/usr/lib/python3.8/zipfile.py", line 1010, in _read1
    data += self._read2(n - len(data))
  File "/usr/lib/python3.8/zipfile.py", line 1042, in _read2
    data = self._fileobj.read(n)
  File "/usr/lib/python3.8/zipfile.py", line 765, in read
    self._file.seek(self._pos)
OSError: [Errno 22] Invalid argument

2xB avatar Jun 06 '25 13:06 2xB

Looking at this in detail I learned:

  • The issue stems from #274 (since v3.8.0), as can be seen from the fact that simply removing the @functools.lru_cache() decorator fixes it, so potentially @anntzer has an idea of how to properly fix this?
  • The issue only occurs when one has .egg files in sys.path, since in that case zipp.Path (zipfile.Path) objects are kept in memory indefinitely. A more detailed explanation of why this issue occurs can be found below.
  • The issue only occurs when multiprocessing uses fork as its start method (this can be enforced with multiprocessing.set_start_method('fork')).
  • A workaround (which should not be the final solution) is to use multiprocessing.set_start_method('spawn') instead. For that to work, put everything that should not be executed in each subprocess (including the set_start_method call) in an if __name__ == "__main__": block.

More context: Due to the caching introduced in #274, references to FastPath objects are stored indefinitely. If one has .egg files on one's Python path, the FastPath objects hold a zipfile.Path (actually zipp.Path) object created in FastPath.zip_children. As long as that object exists, it keeps a file handle to that zip file open. That open file handle is a problem when used with multiprocessing, especially with the fork start method (the default in my case; it can be enforced with multiprocessing.set_start_method('fork')). Compare also https://github.com/python/cpython/issues/83544 .
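To illustrate why an inherited handle is risky, here is a minimal sketch using only the standard library (POSIX only, since it calls os.fork). It is not taken from importlib_metadata, and the function name child_moves_parent_offset is hypothetical. It shows that a file opened before a fork shares its position between parent and child, which is presumably how concurrent workers end up reading a cached zip file from the wrong offset.

```python
import os
import tempfile


def child_moves_parent_offset():
    """Return the parent's file offset after a forked child seeks the shared handle."""
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()
        f.seek(0)  # the parent positions the handle at offset 0

        pid = os.fork()
        if pid == 0:
            # Child: move the offset of the inherited descriptor, then exit
            # immediately without running any cleanup handlers.
            os.lseek(f.fileno(), 7, os.SEEK_SET)
            os._exit(0)

        os.waitpid(pid, 0)
        # Parent: ask the OS where the descriptor points now.
        return os.lseek(f.fileno(), 0, os.SEEK_CUR)


if __name__ == "__main__":
    print(child_moves_parent_offset())
```

The parent never seeks after the fork, yet its offset changes, because both processes refer to the same open file description. With several pool workers seeking and reading at the same time, each one can be moved to a position it did not ask for.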

The open file handles can also be observed by adding

import psutil
proc = psutil.Process()
print(proc.open_files())

after the line eps = [dist.entry_points for dist in dists] in the example above.

2xB avatar Jun 07 '25 22:06 2xB

I didn't look much in depth into the issue, but a way to fix this kind of cache+multiprocessing incompatibility is to clear the cache when starting the child process via https://docs.python.org/3/library/os.html#os.register_at_fork, see e.g. https://github.com/matplotlib/matplotlib/blob/3574a7e8f5243f93c6442e43b4c583448fc95dc6/lib/matplotlib/font_manager.py#L1584-L1590

anntzer avatar Jun 08 '25 10:06 anntzer

Thank you very much! I was not aware of os.register_at_fork; that is a very simple fix. In the meantime, it also means that a very simple workaround is to use

import os, importlib_metadata
os.register_at_fork(after_in_child=importlib_metadata.FastPath.__new__.cache_clear)

before the multiprocessing calls.
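As a self-contained illustration of that pattern (a sketch, not the code from the pull request): cached_square is a hypothetical stand-in for the cached FastPath.__new__, and the helper forks once to check that the hook emptied the child's cache while the parent's cache stays intact. It assumes a platform that provides os.fork.

```python
import functools
import os


@functools.lru_cache()
def cached_square(n):
    # Hypothetical stand-in for a cached constructor such as FastPath.__new__.
    return n * n


# Clear the cache in every child created by fork, before it runs user code.
os.register_at_fork(after_in_child=cached_square.cache_clear)


def cache_sizes_after_fork():
    """Fill the cache in the parent, fork, and return (child_size, parent_size)."""
    cached_square(3)  # the parent cache now holds one entry
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the hook has already run, so the cache should be empty.
        size = cached_square.cache_info().currsize
        os.write(write_fd, str(size).encode())
        os._exit(0)
    os.close(write_fd)
    child_size = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_size, cached_square.cache_info().currsize


if __name__ == "__main__":
    print(cache_sizes_after_fork())
```

Because the hook only runs in children created through fork, it has no effect under the spawn start method, where the child starts with an empty cache anyway.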

2xB avatar Jun 08 '25 12:06 2xB

@jaraco What is the status here? If there is anything I can do to improve this, please tell!

2xB avatar Aug 20 '25 10:08 2xB

Oh, I of course meant the status of the fixing pull request #521, not the status of this issue.

2xB avatar Aug 22 '25 14:08 2xB

Sorry for the delay in review. I'm still catching up on issues from May, but the PR caught my eye.

The register_at_fork seems like a suitable workaround. I worry it's not quite the proper fix (it feels a little bit like the problem lies in zipfile or zipp, as mentioned above). I do like that the proposed solution is essentially only activated at the appropriate scope (multiprocessing + fork).

I'll review the PR and get something merged.

I vaguely recall there may be some work to eliminate FastPath (maybe in CPython), so that may also be relevant, but I don't recall where off the top of my head. Regardless, we can merge this here and reconcile any conflicts when applying upstream.

jaraco avatar Nov 22 '25 16:11 jaraco