Pillow/numpy dependency conflicts in updated images
I am using databricksruntime/standard:11.3-LTS and regularly build new images from this base image. The images are then used to start a Databricks cluster for data processing. Starting two days ago, the cluster refuses to execute anything and fails with the error below:
Traceback (most recent call last):
File "/databricks/python_shell/scripts/db_ipykernel_launcher.py", line 92, in <module>
app.shell.run_line_magic('matplotlib', 'inline')
File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2407, in run_line_magic
result = fn(*args, **kwargs)
File "/databricks/python/lib/python3.9/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/databricks/python/lib/python3.9/site-packages/IPython/core/magic.py", line 187, in <lambda>
call = lambda f, *a, **k: f(*a, **k)
File "/databricks/python/lib/python3.9/site-packages/IPython/core/magics/pylab.py", line 99, in matplotlib
gui, backend = self.shell.enable_matplotlib(args.gui.lower() if isinstance(args.gui, str) else args.gui)
File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3600, in enable_matplotlib
from matplotlib_inline.backend_inline import configure_inline_support
File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/__init__.py", line 1, in <module>
from . import backend_inline, config # noqa
File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/backend_inline.py", line 6, in <module>
import matplotlib
File "/databricks/python/lib/python3.9/site-packages/matplotlib/__init__.py", line 107, in <module>
from . import _api, cbook, docstring, rcsetup
File "/databricks/python/lib/python3.9/site-packages/matplotlib/rcsetup.py", line 24, in <module>
from matplotlib import _api, animation, cbook
File "/databricks/python/lib/python3.9/site-packages/matplotlib/animation.py", line 34, in <module>
from PIL import Image
File "/databricks/python/lib/python3.9/site-packages/PIL/Image.py", line 68, in <module>
from ._typing import StrOrBytesPath, TypeGuard
File "/databricks/python/lib/python3.9/site-packages/PIL/_typing.py", line 10, in <module>
NumpyArray = npt.NDArray[Any]
AttributeError: module 'numpy.typing' has no attribute 'NDArray'
Upon inspection I found that pillow was updated from 10.3.0 to 10.4.0 in the image. The new version requires numpy>=1.21.0, but numpy is still kept at 1.20.3. I could install my own pinned numpy version to override the defaults, but anyone who starts a Databricks cluster from the vanilla image will end up with an unusable cluster.
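For anyone who wants to confirm whether their own image is affected, here is a minimal sketch of a version check that can be run inside the container. It assumes the numpy>=1.21.0 floor described above; `MIN_NUMPY`, `release_tuple`, `numpy_is_new_enough` and `installed_version` are hypothetical helper names of mine, not part of any library.

```python
from importlib import metadata

# Assumption taken from this report: pillow 10.4.0 needs numpy >= 1.21.0.
MIN_NUMPY = (1, 21)


def release_tuple(version):
    """Return the leading numeric (major, minor) parts of a version string."""
    parts = []
    for piece in version.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def numpy_is_new_enough(numpy_version):
    """True if the given numpy version meets the assumed 1.21 floor."""
    return release_tuple(numpy_version) >= MIN_NUMPY


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    numpy_version = installed_version("numpy")
    pillow_version = installed_version("pillow")
    print("numpy:", numpy_version, "pillow:", pillow_version)
    if numpy_version is not None and not numpy_is_new_enough(numpy_version):
        print("numpy is older than 1.21; importing PIL may fail as in the traceback")
```

If the check reports an old numpy, the workaround I have in mind is pinning in the derived image, either numpy to 1.21.0 or later or pillow back to 10.3.0. I have not verified that either pin is compatible with the rest of the runtime.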
P.S. The sha for the image mentioned above is 232630f809512f5831540e072fa2a5ca3096139f11bdabe6812fbb3eeb9d78bf