Pillow/numpy dependency conflicts in updated images

Open kamtingtsoi opened this issue 1 year ago • 2 comments

I use databricksruntime/standard:11.3-LTS as a base image and regularly build new images from it. Those images are then used to start a Databricks cluster for data processing. Two days ago, the cluster started refusing to execute anything, failing with the error below:

Traceback (most recent call last):
  File "/databricks/python_shell/scripts/db_ipykernel_launcher.py", line 92, in <module>
    app.shell.run_line_magic('matplotlib', 'inline')
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2407, in run_line_magic
    result = fn(*args, **kwargs)
  File "/databricks/python/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/magic.py", line 187, in <lambda>
    call = lambda f, *a, **k: f(*a, **k)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/magics/pylab.py", line 99, in matplotlib
    gui, backend = self.shell.enable_matplotlib(args.gui.lower() if isinstance(args.gui, str) else args.gui)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3600, in enable_matplotlib
    from matplotlib_inline.backend_inline import configure_inline_support
  File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/__init__.py", line 1, in <module>
    from . import backend_inline, config  # noqa
  File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/backend_inline.py", line 6, in <module>
    import matplotlib
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/__init__.py", line 107, in <module>
    from . import _api, cbook, docstring, rcsetup
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/rcsetup.py", line 24, in <module>
    from matplotlib import _api, animation, cbook
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/animation.py", line 34, in <module>
    from PIL import Image
  File "/databricks/python/lib/python3.9/site-packages/PIL/Image.py", line 68, in <module>
    from ._typing import StrOrBytesPath, TypeGuard
  File "/databricks/python/lib/python3.9/site-packages/PIL/_typing.py", line 10, in <module>
    NumpyArray = npt.NDArray[Any]
AttributeError: module 'numpy.typing' has no attribute 'NDArray'

On inspection, I found that pillow was updated from 10.3.0 to 10.4.0, which now requires numpy>=1.21.0, while numpy in the image is still kept at 1.20.3. I could install my own pinned numpy version to override the defaults, but anyone who uses the vanilla image on a Databricks cluster will end up with an unusable cluster.
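For anyone who wants to check whether their own image is affected, here is a minimal diagnostic sketch. It only encodes the combination reported in this issue (pillow 10.4.0 together with a numpy older than 1.21.0); whether other pillow releases behave the same way is not something it tries to answer. The helper names `parse_version`, `is_reported_conflict`, and `installed_version` are made up for illustration and are not part of any library.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.20.3' into (1, 20, 3), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def is_reported_conflict(pillow_version, numpy_version):
    """True only for the pairing described in this issue (an assumption,
    not a complete compatibility matrix)."""
    return (
        parse_version(pillow_version) == (10, 4, 0)
        and parse_version(numpy_version) < (1, 21, 0)
    )


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    pillow = installed_version("pillow")
    numpy = installed_version("numpy")
    print(f"pillow={pillow} numpy={numpy}")
    if pillow and numpy and is_reported_conflict(pillow, numpy):
        print("This environment matches the conflict reported in this issue.")
    else:
        print("This environment does not match the reported combination.")
```

Running this with the image's own interpreter (the traceback shows it under /databricks/python) reports the versions that the notebook kernel actually sees, rather than those of some other Python on the system.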

P.S. The SHA of the image mentioned above is 232630f809512f5831540e072fa2a5ca3096139f11bdabe6812fbb3eeb9d78bf

kamtingtsoi, Jul 23 '24 14:07