Incomplete Image Nvidia/PyTorch

Open justusschock opened this issue 2 years ago • 2 comments

Expected Behavior

A fully working image, just like the original.


Actual Behavior

The image size was reduced nicely, but the slimmed image does not work.

-> docker run nvcr.io/nvidia/pytorch.slim
/etc/shinit_v2: line 42: dpkg: command not found
/opt/nvidia/nvidia_entrypoint.sh: line 30: readlink: command not found
/opt/nvidia/nvidia_entrypoint.sh: line 30: dirname: command not found

/etc/shinit_v2: line 42: dpkg: command not found
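To see at a glance which binaries the entrypoint scripts expect but cannot find, the error output above can be summarized with a small script. This is only a minimal sketch: the helper name `missing_commands` and the regular expression are illustrative and not part of docker-slim, and it assumes the errors follow the `script: line N: cmd: command not found` shape shown in the output above.

```python
import re

# Matches shell errors of the form seen in the output above, e.g.
#   /etc/shinit_v2: line 42: dpkg: command not found
_MISSING = re.compile(
    r"^(?P<script>[^:]+): line (?P<line>\d+): (?P<cmd>[^:]+): command not found$"
)


def missing_commands(log_text):
    """Return distinct command names reported as missing, in first-seen order."""
    seen = []
    for raw in log_text.splitlines():
        match = _MISSING.match(raw.strip())
        if match and match.group("cmd") not in seen:
            seen.append(match.group("cmd"))
    return seen


# The log lines are copied from the failing `docker run` above.
log = """\
/etc/shinit_v2: line 42: dpkg: command not found
/opt/nvidia/nvidia_entrypoint.sh: line 30: readlink: command not found
/opt/nvidia/nvidia_entrypoint.sh: line 30: dirname: command not found

/etc/shinit_v2: line 42: dpkg: command not found
"""

print(missing_commands(log))
```

The resulting list names the executables that were present in the original image but did not survive minification, which is a starting point for deciding what to keep explicitly.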

Steps to Reproduce the Problem

  1. docker pull nvcr.io/nvidia/pytorch:21.10-py3
  2. docker-slim build --target nvcr.io/nvidia/pytorch:21.10-py3 --http-probe-off
  3. docker run nvcr.io/nvidia/pytorch.slim

Specifications

  • Version: 1.37.5
  • Platform: Linux (Ubuntu)

justusschock · Mar 29 '22 12:03

A similar thing happened in my case with a self-built TensorFlow image with CUDA / cuDNN + RAPIDS. The image is really big, so it should be a perfect candidate for trimming down.

1st run:

$ docker-slim build --target myimage --http-probe-off
cmd=build info=results size.optimized='3.7 MB' status='MINIFIED' by='5337.08X' size.original='20 GB'

The build is useless, as practically everything is stripped out.

2nd run:

$ docker-slim build --target myimage --include-path /usr --http-probe-off
cmd=build info=results status='MINIFIED' by='1.15X' size.original='20 GB' size.optimized='17 GB'

I specifically asked the build to keep the libraries under /usr inside the container. But even this much lighter trimming broke the otherwise working container.

$ docker run --rm -it myimage.slim bash
bash-4.2# python3
Python 3.8.9 (default, Apr  7 2022, 12:44:12)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cuml
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/cuml-22.4.0-py3.8-linux-x86_64.egg/cuml/__init__.py", line 17, in <module>
    from cuml.common.base import Base
  File "/usr/local/lib/python3.8/site-packages/cuml-22.4.0-py3.8-linux-x86_64.egg/cuml/common/__init__.py", line 17, in <module>
    from cuml.common.array import CumlArray
  File "/usr/local/lib/python3.8/site-packages/cuml-22.4.0-py3.8-linux-x86_64.egg/cuml/common/array.py", line 25, in <module>
    from cudf import DataFrame
  File "/usr/local/lib/python3.8/site-packages/cudf-22.4.0-py3.8-linux-x86_64.egg/cudf/__init__.py", line 5, in <module>
    validate_setup()
  File "/usr/local/lib/python3.8/site-packages/cudf-22.4.0-py3.8-linux-x86_64.egg/cudf/utils/gpu_utils.py", line 89, in validate_setup
    cuda_runtime_version = runtimeGetVersion()
  File "/usr/local/lib/python3.8/site-packages/rmm/_cuda/gpu.py", line 87, in runtimeGetVersion
    major, minor = numba.cuda.runtime.get_version()
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cudadrv/runtime.py", line 111, in get_version
    self.cudaRuntimeGetVersion(ctypes.byref(rtver))
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__
    self._initialize()
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cudadrv/runtime.py", line 51, in _initialize
    self.lib = open_cudalib('cudart')
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cudadrv/libs.py", line 59, in open_cudalib
    path = get_cudalib(lib)
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cudadrv/libs.py", line 51, in get_cudalib
    libdir = get_cuda_paths()['cudalib_dir'].info
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cuda_paths.py", line 158, in get_cuda_paths
    'nvvm': _get_nvvm_path(),
  File "/usr/local/lib/python3.8/site-packages/numba/cuda/cuda_paths.py", line 136, in _get_nvvm_path
    candidates = find_lib('nvvm', path)
  File "/usr/local/lib/python3.8/site-packages/numba/misc/findlib.py", line 44, in find_lib
    return find_file(regex, libdir)
  File "/usr/local/lib/python3.8/site-packages/numba/misc/findlib.py", line 56, in find_file
    entries = os.listdir(ldir)
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/cuda/nvvm/lib64'

Strangely, it complains about a missing path under /usr, which was explicitly included.
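One way to narrow this down is to check which component of the reported path is actually missing. The sketch below walks a POSIX path one component at a time and reports the first prefix that does not exist, flagging dangling symlinks separately. The helper name `first_missing_component` is hypothetical, and the idea that `/usr/local/cuda` might be a symlink whose target was dropped is only a guess, not something confirmed in this thread; the demo therefore builds a throwaway directory tree rather than touching a real image.

```python
import os
import tempfile


def first_missing_component(path):
    """Return the first prefix of an absolute POSIX path that does not exist.

    Dangling symlinks are reported with their target. Returns None when the
    whole path resolves.
    """
    current = os.sep
    for part in path.strip(os.sep).split(os.sep):
        current = os.path.join(current, part)
        if os.path.islink(current) and not os.path.exists(current):
            return current + " (dangling symlink -> " + os.readlink(current) + ")"
        if not os.path.exists(current):
            return current
    return None


# Demo on a temporary tree (an assumption for illustration only):
# usr/local/cuda is a symlink whose target directory does not exist.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "usr", "local"))
os.symlink("cuda-missing", os.path.join(demo_root, "usr", "local", "cuda"))

report = first_missing_component(
    os.path.join(demo_root, "usr", "local", "cuda", "nvvm", "lib64")
)
print(report)
```

Running the same function with `/usr/local/cuda/nvvm/lib64` inside both the original and the slimmed container, and comparing the results, would show whether the directory itself, one of its parents, or a symlink target went missing.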

Atharex · May 12 '22 05:05

Hello @kcq, are there any updates?

Maxfashko · Oct 24 '22 14:10