Issues when combined with multiprocessing (MaybeEncodingError, BadZipFile, OSError)
In use with multiprocessing (e.g. when pickling awkward arrays to transmit them from a subprocess to the host process), I see issues with importlib_metadata. This was originally observed by @richeldichel as a `multiprocessing.pool.MaybeEncodingError: Error sending result:` with a `BadZipFile` error or `OSError` appended. I could reduce this to the following minimal working example, which reproduces the underlying error intermittently (on every second to every tenth try):
```python
import importlib_metadata

dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing

def process(i):
    importlib_metadata.entry_points()
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass
```
(Edit: here is a further simplified version:)
```python
import importlib_metadata

dists = importlib_metadata.MetadataPathFinder().find_distributions()
eps = [dist.entry_points for dist in dists]

import multiprocessing

def process(i):
    dists = importlib_metadata.MetadataPathFinder().find_distributions()
    [dist._normalized_name for dist in dists]
    return

with multiprocessing.Pool(processes=8) as pool:
    for _ in pool.imap_unordered(process, range(100)):
        pass
```
Tested with Python 3.10.12 with importlib-metadata==8.7.0 and with Python 3.8.10 with importlib-metadata==8.5.0. Note that in the latter test environment, the issue only occurs for me after running `export PYTHONPATH=/home/.../.local/lib/python3.8/site-packages/` beforehand.

Below is an example error log:
```
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/delthis/test6.py", line 9, in process
    importlib_metadata.entry_points()
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1094, in entry_points
    return EntryPoints(eps).select(**params)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1091, in <genexpr>
    eps = itertools.chain.from_iterable(
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/_itertools.py", line 17, in unique_everseen
    k = key(element)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/compat/py39.py", line 23, in normalized_name
    return dist._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 1016, in _normalized_name
    or super()._normalized_name
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 552, in _normalized_name
    return Prepared.normalize(self.name)
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 547, in name
    return md_none(self.metadata)['Name']
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 528, in metadata
    or self.read_text('PKG-INFO')
  File "/home/user/.local/lib/python3.10/site-packages/importlib_metadata/__init__.py", line 998, in read_text
    return self._path.joinpath(filename).read_text(encoding='utf-8')
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 382, in read_text
    with self.open('r', encoding, *args, **kwargs) as strm:
  File "/home/user/.local/lib/python3.10/site-packages/zipp/__init__.py", line 348, in open
    stream = self.root.open(self.at, zip_mode, pwd=pwd)
  File "/usr/lib/python3.10/zipfile.py", line 1546, in open
    raise BadZipFile("Bad magic number for file header")
zipfile.BadZipFile: Bad magic number for file header
```
I also saw `zipfile.BadZipFile: Overlapped entries: 'EGG-INFO/PKG-INFO' (possible zip bomb)` and
```
  File "/home/.../.local/lib/python3.8/site-packages/zipp/__init__.py", line 385, in read_text
    return strm.read()
  File "/usr/lib/python3.8/zipfile.py", line 928, in read
    buf += self._read1(self.MAX_N)
  File "/usr/lib/python3.8/zipfile.py", line 1010, in _read1
    data += self._read2(n - len(data))
  File "/usr/lib/python3.8/zipfile.py", line 1042, in _read2
    data = self._fileobj.read(n)
  File "/usr/lib/python3.8/zipfile.py", line 765, in read
    self._file.seek(self._pos)
OSError: [Errno 22] Invalid argument
```
Looking at this in detail, I learned:

- The issue stems from #274 (present since v3.8.0), as simply removing the `@functools.lru_cache()` decorator fixes it, so perhaps @anntzer has an idea of how to fix this properly.
- The issue only occurs when one has `.egg` files in `sys.path`, since in that case `zipp.Path` (`zipfile.Path`) objects are kept in memory indefinitely. A more detailed explanation of why this happens can be found below.
- The issue only occurs when one uses `fork` as the multiprocessing start method (it can be enforced with `multiprocessing.set_start_method('fork')`).
- A workaround (which should not be the final solution) is using `multiprocessing.set_start_method('spawn')` instead; put everything that should not be executed in each subprocess (including the `set_start_method` call) into an `if __name__ == "__main__":` block for this to work.
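The spawn-based workaround can be sketched as follows. This is only a sketch: it assumes the stdlib `importlib.metadata` behaves like the `importlib_metadata` backport for this purpose, and that the script is run directly so the `__main__` guard applies.

```python
# Sketch of the 'spawn' workaround (assumption: stdlib importlib.metadata
# stands in for the importlib_metadata backport).
import importlib.metadata
import multiprocessing

def process(i):
    # each spawned child starts from a fresh interpreter, so it builds its
    # own path caches instead of inheriting the parent's open zip handles
    importlib.metadata.entry_points()
    return i

if __name__ == "__main__":
    # everything that must not run in the children goes inside this guard
    multiprocessing.set_start_method('spawn')
    dists = importlib.metadata.MetadataPathFinder().find_distributions()
    eps = [dist.entry_points for dist in dists]
    with multiprocessing.Pool(processes=4) as pool:
        results = sorted(pool.imap_unordered(process, range(8)))
```

With `spawn`, children re-import the main module, which is why the one-off setup must be guarded; `fork`-based code without the guard will crash or misbehave under `spawn`.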
More context: due to the caching introduced in #274, references to `FastPath` objects are stored indefinitely. If one has `.egg` files in one's Python path, the `FastPath` objects hold a `zipfile.Path` (actually `zipp.Path`) object created in `FastPath.zip_children`. As long as that object exists, it keeps an open file handle to the zip file. And that open file handle is a problem when used with multiprocessing, especially with the `fork` start method (the default in my case; it can be enforced with `multiprocessing.set_start_method('fork')`); compare also https://github.com/python/cpython/issues/83544 .
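The underlying hazard is that a file descriptor inherited across `fork()` shares its seek offset with the parent, so concurrent seeks and reads corrupt each other. This can be shown without zip files at all; a minimal POSIX-only sketch (`os.fork` is not available on Windows):

```python
# POSIX-only sketch: an fd opened before fork() shares its seek offset
# between parent and child, which is why cached zip handles see corrupted
# reads under the fork start method.
import os

fd = os.open(__file__, os.O_RDONLY)   # opened before fork, like the cached zipfile
pid = os.fork()
if pid == 0:
    os.lseek(fd, 10, os.SEEK_SET)     # child moves the shared offset
    os._exit(0)
os.waitpid(pid, 0)
pos = os.lseek(fd, 0, os.SEEK_CUR)    # parent observes the child's seek
print(pos)                            # → 10
os.close(fd)
```

In the real failure, several pool workers and the parent all seek and read the same inherited zip handle, so `zipfile` lands on the wrong offset and raises `BadZipFile` or `OSError`.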
The open file handles can also be observed by adding

```python
import psutil

proc = psutil.Process()
print(proc.open_files())
```

after the line `eps = [dist.entry_points for dist in dists]` in the example above.
I didn't look into the issue in much depth, but one way to fix this kind of cache + multiprocessing incompatibility is to clear the cache when the child process starts, via https://docs.python.org/3/library/os.html#os.register_at_fork; see e.g. https://github.com/matplotlib/matplotlib/blob/3574a7e8f5243f93c6442e43b4c583448fc95dc6/lib/matplotlib/font_manager.py#L1584-L1590
Thank you very much! I had not considered the existence of `os.register_at_fork`; that is a very simple fix. In the meantime, it also means a very simple workaround is to run

```python
import os, importlib_metadata

os.register_at_fork(after_in_child=importlib_metadata.FastPath.__new__.cache_clear)
```

before the multiprocessing calls.
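The same pattern works for any `lru_cache`-decorated function. A POSIX-only sketch with a hypothetical `open_data` helper standing in for `FastPath.__new__` (not importlib_metadata's actual code):

```python
# POSIX-only sketch of the register_at_fork pattern, with a hypothetical
# cached opener in place of FastPath.__new__.
import functools
import os

@functools.lru_cache()
def open_data(path):
    return open(path, "rb")

# drop the cache (and thus the inherited file handles) in every forked child
os.register_at_fork(after_in_child=open_data.cache_clear)

open_data(__file__)                   # parent populates the cache
pid = os.fork()
if pid == 0:
    # the hook already ran: the child starts with an empty cache and will
    # open its own handles instead of sharing the parent's seek offsets
    ok = open_data.cache_info().currsize == 0
    os._exit(0 if ok else 1)
_, status = os.waitpid(pid, 0)
print(status == 0)                    # → True
```

Note that the parent's cache is untouched; only the child clears it, which is exactly the scope where the inherited handles are dangerous.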
@jaraco What is the status here? If there is anything I can do to improve this, please tell!
Oh, I of course meant the status of the fixing pull request #521, not the status of this issue.
Sorry for the delay in review. I'm still catching up on issues from May, but the PR caught my eye.
The `register_at_fork` approach seems like a suitable workaround. I worry it's not quite the proper fix (it feels a little bit like the problem actually lies in zipfile or zipp, as mentioned above). I do like that the proposed solution is essentially only activated in the appropriate scope (multiprocessing + fork).
I'll review the PR and get something merged.
I vaguely recall there may be some work to eliminate FastPath (maybe in CPython), so that may also be relevant, but I don't recall where off the top of my head. Regardless, we can merge this here and reconcile any conflicts when applying upstream.