Lazy load matplotlib to avoid import locks
On a SLURM cluster I've seen some hangs caused by igraph loading matplotlib:
TimeoutError: Lock error: Matplotlib failed to acquire the following lock file:
/dev/shm/.cache-sdash/matplotlib/fontlist-v330.json.matplotlib-lock
This maybe due to another process holding this lock file. If you are sure no
other Matplotlib process is running, remove this file and try again.
with cbook._lock_path(filename), open(filename, 'w') as fh:
File "/mnt/sw/nix/store/gpkc8q6zjnp3n3h3w9hbmbj6gjbxs85w-python-3.10.10-view/lib/python3.10/contextlib.py", line 135, in __enter__
raise TimeoutError("""\
Unfortunately, it is not in my power to uninstall matplotlib because it is installed in a shared location. Ideally, igraph would load it lazily, only when needed.
Thanks for reporting. This is a site installation issue, so the underlying problem is out of our control. Nonetheless, you could:
- use a per-user Python interpreter + packages
- use PYTHONPATH just for matplotlib
- mock matplotlib so that the real package is never imported
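The mocking option can be sketched roughly as follows. This is only an illustration: `stub_modules` is a hypothetical helper, and whether igraph works correctly against a stubbed matplotlib is an assumption that has not been checked here. Each submodule has to be registered explicitly, because a `MagicMock` is not a real package.

```python
import sys
from unittest import mock


def stub_modules(*names):
    """Register MagicMock placeholders in sys.modules for the given names.

    Returns the names that were actually added (already-imported modules
    are left untouched). Hypothetical helper, not part of igraph.
    """
    added = []
    for name in names:
        if name not in sys.modules:
            sys.modules[name] = mock.MagicMock(name=name)
            added.append(name)
    return added


# Assumed usage: this must run before the first `import igraph` in the process.
# stub_modules("matplotlib", "matplotlib.pyplot")
# import igraph
```

Any plotting call would then hit the mock instead of the real library, so this only makes sense for jobs that never draw anything.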
We could also lazy load, as you suggested. I'm really busy these days; do you think you could open a PR? I'd be happy to assist.
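For reference, a minimal sketch of what lazy loading could look like. `LazyModule` is a hypothetical name, and this is not how igraph currently handles matplotlib; it only shows the general pattern of deferring the import until the first attribute access.

```python
import importlib


class LazyModule:
    """Proxy that imports the wrapped module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def _load(self):
        # The real import happens here, not when the proxy is created.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself.
        if attr in ("_name", "_module"):
            raise AttributeError(attr)
        return getattr(self._load(), attr)


# Assumed usage inside a library:
# plt = LazyModule("matplotlib.pyplot")   # no import yet
# plt.figure()                            # import happens here
```

With this pattern, a job that never plots never touches matplotlib or its font cache lock.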
The error message that you see appears when multiple processes are using Matplotlib and they all want to write to the font cache file at the same time (fontlist-v330.json, which in your traceback lives under /dev/shm/.cache-sdash/matplotlib/). This file is built the first time Matplotlib is invoked on the system, so you can work around the problem by pre-populating the Matplotlib font cache. IMHO lazy-loading Matplotlib wouldn't solve the problem, as two processes could still try to lazy-load Matplotlib at the same time.
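Pre-populating the cache could be done with a small script run once, in a single process, before the parallel jobs start. The sketch below assumes that importing `matplotlib.font_manager` is enough to make Matplotlib write its font list when the cache is missing; `warm_import` is a hypothetical helper name.

```python
import importlib.util
import subprocess
import sys


def warm_import(module="matplotlib.font_manager", python=sys.executable):
    """Import `module` once in a child process so its first-run files get written.

    Returns False without doing anything if the top-level package is not installed.
    """
    top_level = module.split(".")[0]
    if importlib.util.find_spec(top_level) is None:
        return False
    # A single child process does the import, so nothing competes for the lock.
    subprocess.run([python, "-c", f"import {module}"], check=True)
    return True


# Assumed usage: call once (e.g. from the submission script) before launching the jobs.
# warm_import()
```

The warm-up has to run with the same cache directory that the jobs will use, otherwise each job still finds an empty cache.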
Another thing that you could try is to add a random delay to the startup of your process to decrease the chance of two copies of your project trying to import matplotlib at the same time.
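A random startup delay might look like the sketch below. The function name and the 30-second default are arbitrary assumptions; this only reduces the chance of a collision and does not eliminate it.

```python
import random
import time


def jittered_start(max_delay_s=30.0, rng=random, sleep=time.sleep):
    """Sleep for a random duration in [0, max_delay_s] and return the delay used."""
    delay = rng.uniform(0.0, max_delay_s)
    sleep(delay)
    return delay


# Assumed usage: call at the very top of the job script, before importing igraph.
# jittered_start()
# import igraph
```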
I had a similar issue with a different piece of software that also created certain files on first start. I worked around it by logging in to the nodes of the HPC cluster where my code would run and starting a single instance of that software manually, so that the files could be created correctly.