
Failed to discover NVIDIA GPU in the running container started by buildah (vfs + chroot)

Open enihcam opened this issue 1 year ago • 16 comments

Description: Failed to discover NVIDIA GPU in the running container started by buildah (vfs + chroot)

Steps to reproduce the issue:

  1. Start a GPU container that does NOT support Docker-in-Docker (for security reasons).
  2. Install buildah inside that container.
  3. Set the storage driver with export STORAGE_DRIVER=vfs and the isolation mode with export BUILDAH_ISOLATION=chroot.
  4. Build a PyTorch+CUDA image with buildah, then run it with buildah.
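
Steps 3 and 4 could look roughly like the sketch below. It assumes a Containerfile in the current directory; the image tag pytorch-cuda-test, the function name reproduce, and the one-line PyTorch check are illustrative and do not come from the report.

```shell
# Step 3: storage driver and isolation mode, as given in the report.
export STORAGE_DRIVER=vfs
export BUILDAH_ISOLATION=chroot

# Hypothetical image tag, not taken from the report.
IMAGE=pytorch-cuda-test
# Hypothetical check: ask PyTorch whether it can see a CUDA device.
CHECK='import torch; print(torch.cuda.is_available())'

# Step 4, wrapped in a function so that nothing runs until it is called.
reproduce() {
    # Assumes a Containerfile for the PyTorch+CUDA image in the current directory.
    buildah build -t "$IMAGE" . || return 1
    ctr=$(buildah from "$IMAGE") || return 1
    # Prints True when the GPU is visible inside the container, False otherwise.
    buildah run "$ctr" -- python3 -c "$CHECK"
}
```

Calling reproduce in the GPU container would then build the image and run the check in a working container.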

Describe the results you received: (screenshot attached to the original issue; not preserved here)

Describe the results you expected: PyTorch finds the GPU and runs the code successfully.

Output of rpm -q buildah or apt list buildah:

# rpm -q buildah
buildah-1.30.0-1.tl4.x86_64

Output of buildah version:

# buildah version
Version:         1.30.0
Go Version:      go1.19
Image Spec:      1.0.2-dev
Runtime Spec:    1.1.0-rc.1
CNI Spec:        1.0.0
libcni Version:  v1.1.2
image Version:   5.25.0
Git Commit:
Built:           Fri Jul 14 19:36:27 2023
OS/Arch:         linux/amd64
BuildPlatform:   linux/amd64

Output of podman version if reporting a podman build issue:

(paste your output here)

Output of cat /etc/*release:

# cat /etc/*release
NAME="TencentOS Server"
VERSION="4.0"
ID="tencentos"
ID_LIKE="tencentos"
VERSION_ID="4.0"
PLATFORM_ID="platform:tl4.0"
PRETTY_NAME="TencentOS Server 4.0"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:tencentos:tencentos:4.0"
HOME_URL="https://cloud.tencent.com/product/ts"
BUG_REPORT_URL="https://cloud.tencent.com/product/ts"
TencentOS Server 4.0

Output of uname -a:

# uname -a
Linux root-pvkf3ma0a 5.4.119-19.0009.28 #1 SMP Thu May 18 10:37:10 CST 2023 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# cat /etc/containers/storage.conf
[storage]
driver = "vfs"
runroot = "/data/containers/storage"
graphroot = "/data/containers/storage"
rootless_storage_path = "/data/containers/storage"

[storage.options.vfs]
ignore_chown_errors = "true"

enihcam, Dec 16 '23 07:12

Isn't the GPU a device? Say /dev/gpu?

Could you try

ctr=$(buildah from --device /dev/gpu ...)
buildah run $ctr ...

rhatdan, Dec 16 '23 10:12
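
The /dev/gpu path in the suggestion above reads as a placeholder. With the NVIDIA driver, the device nodes are typically named /dev/nvidia0, /dev/nvidiactl, /dev/nvidia-uvm and so on, although the exact set depends on the host. A minimal sketch that builds one --device flag per matching entry follows; the helper name nvidia_device_flags is hypothetical, and whether passing these nodes is enough for PyTorch to see the GPU is not confirmed in this thread.

```shell
# Hypothetical helper: print one "--device <path>" pair for every entry
# named nvidia* in the given directory (default: /dev).
# The /dev/nvidia* naming is an assumption about the host's driver setup.
nvidia_device_flags() {
    dir=${1:-/dev}
    for dev in "$dir"/nvidia*; do
        # Skip the literal pattern when nothing matches.
        [ -e "$dev" ] || continue
        # Skip directories such as /dev/nvidia-caps.
        [ -d "$dev" ] && continue
        printf '%s %s ' --device "$dev"
    done
}

# Usage, with the image argument elided as in the comment above:
#   ctr=$(buildah from $(nvidia_device_flags) ...)
#   buildah run $ctr ...
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate arguments; this assumes the device paths contain no spaces.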

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot], Jan 16 '24 00:01