
[FEATURE] GPU device passthrough in sandbox mode

Open · benvanik opened this issue 1 month ago · 2 comments

Preflight Checklist

  • [x] I have searched existing requests and this feature hasn't been requested yet
  • [x] This is a single feature request (not multiple features)

Problem Statement

Sandbox mode blocks GPU access because device nodes such as /dev/dri and /dev/kfd aren't passed through. This breaks Vulkan, ROCm/HIP, and CUDA workflows whenever the sandbox is enabled.

Step 1: Configure sandbox for filesystem protection

{
  "sandbox": { "enabled": true, "autoAllowBashIfSandboxed": true },
  "permissions": { "allow": ["Bash"] }
}

Step 2: Ask Claude to verify GPU access: "Check if the GPU is available and run a simple PyTorch test"

Step 3: Claude runs diagnostics:

# AMD GPU
rocm-smi --showid
hipconfig --version

# NVIDIA GPU
nvidia-smi

# Vulkan (any GPU)
vulkaninfo --summary

Step 4: Today, all of these fail because the device nodes don't exist inside the sandbox:

$ rocm-smi --showid
No AMD GPUs found

$ vulkaninfo --summary
GPU0: llvmpipe (LLVM 20.1.8, 256 bits)    # CPU fallback only
      deviceType = PHYSICAL_DEVICE_TYPE_CPU

$ ls /dev/dri /dev/kfd
ls: cannot access '/dev/dri': No such file or directory
ls: cannot access '/dev/kfd': No such file or directory

Expected: GPU device nodes passed through, hardware detected.
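
For quick triage inside the sandbox, a short shell loop (illustrative only, not part of Claude Code) makes the missing nodes obvious:

# Report which GPU device nodes are visible from inside the sandbox.
# Unmatched globs (e.g. /dev/nvidia* on an AMD-only box) simply report as missing.
for dev in /dev/kfd /dev/dri/renderD* /dev/nvidia*; do
    if [ -e "$dev" ]; then echo "present: $dev"; else echo "missing: $dev"; fi
done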

Proposed Solution

Add a sandbox.devices setting, e.g. for AMD:

{
  "sandbox": {
    "enabled": true,
    "devices": ["/dev/dri", "/dev/kfd"]
  }
}

Or for NVIDIA:

{
  "sandbox": {
    "enabled": true,
    "devices": ["/dev/dri", "/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"]
  }
}

Claude Code's sandbox already uses bubblewrap (https://github.com/containers/bubblewrap) on Linux, and bwrap natively supports device passthrough (https://www.mankier.com/1/bwrap):

  --dev-bind-try SRC DEST   Bind mount host path SRC on DEST, allowing device access.
                            Ignores non-existent SRC (graceful on systems without GPUs).

So the setting above would translate into the following additions to the bwrap invocation:

  --dev-bind-try /dev/dri /dev/dri \
  --dev-bind-try /dev/kfd /dev/kfd

or

  --dev-bind-try /dev/nvidia0 /dev/nvidia0 \
  --dev-bind-try /dev/nvidiactl /dev/nvidiactl \
  --dev-bind-try /dev/nvidia-uvm /dev/nvidia-uvm
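
For illustration, here is a minimal sketch of how such a setting could be expanded into bwrap flags. The settings path, the jq dependency, and the wiring into Claude Code's launcher are all assumptions here, not the actual implementation:

#!/bin/bash
# Hypothetical sketch: expand a sandbox.devices list into --dev-bind-try flags.
# Assumes settings live at ~/.claude/settings.json (an assumption, not the real path).
set -euo pipefail

SETTINGS="${HOME}/.claude/settings.json"

extra=()
while IFS= read -r dev; do
    if [[ -n "$dev" ]]; then
        extra+=(--dev-bind-try "$dev" "$dev")
    fi
done < <(jq -r '.sandbox.devices // [] | .[]' "$SETTINGS")

# In the real invocation these flags would be appended after --dev /dev so the
# fresh devtmpfs does not shadow the binds.
printf '%s\n' "Flags to append after --dev /dev:" "${extra[@]}"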

Alternative Solutions

The only workaround I've found is to disable sandboxing, which is unfortunate. I'd love to use the sandbox instead of having to run things through Docker, where GPU passthrough is possible but much more painful.

Priority

Critical - Blocking my work

Feature Category

Configuration and settings

Use Case Example

My primary use is developing GPU-accelerated code with Claude Code in sandbox mode. I want to automate kernel authoring, fine-tuning/optimization, and correctness testing, as well as run my existing test suites (PyTorch CUDA/ROCm, ONNX, vLLM, etc.).

There are a few other things we use as well (notably ffmpeg as part of image pipelines), and there are quite a few popular GPU-accelerated tools that cannot run properly in sandbox mode; supporting them would be generally useful:

| Software | Use Case | Blocked By |
| --- | --- | --- |
| PyTorch | ML training/inference | /dev/kfd, /dev/nvidia* |
| TensorFlow | ML training/inference | /dev/kfd, /dev/nvidia* |
| JAX | ML research | /dev/kfd, /dev/nvidia* |
| IREE | ML compiler runtime | /dev/kfd, /dev/dri |
| Blender | 3D rendering (Cycles) | /dev/dri, /dev/nvidia* |
| FFmpeg | Video encoding (VAAPI/NVENC, example below) | /dev/dri, /dev/nvidia* |
| darktable/RawTherapee | Photo processing (OpenCL) | /dev/dri |
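
As one example from the table, hardware-accelerated encoding in FFmpeg talks to the render node directly, so it falls back to software (or fails outright) without /dev/dri. An illustrative command, with placeholder file names and codec availability depending on the driver:

# VAAPI-accelerated H.264 encode; requires /dev/dri/renderD* inside the sandbox.
ffmpeg -vaapi_device /dev/dri/renderD128 -i input.mp4 \
       -vf 'format=nv12,hwupload' -c:v h264_vaapi output.mp4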

Concretely, I want to enable the sandbox and then have Claude run, directly or indirectly via scripts/tools, commands like:

python3 -c "import torch; print(torch.cuda.is_available())"      # NVIDIA (ROCm builds also report through torch.cuda)
python3 -c "import torch; print(torch.version.hip)"              # AMD: non-None on ROCm builds

Additional Context

Prior Art

Flatpak uses identical bwrap flags for GPU-accelerated sandboxed apps:

  • Flatpak's --device=dri permission grants access to /dev/dri for OpenGL/Vulkan (https://docs.flatpak.org/en/latest/sandbox-permissions.html)
  • Flatpak passes through GPU device nodes including /dev/dri, /dev/nvidia*, and /dev/mali when DRI access is requested (https://github.com/flatpak/flatpak/issues/3330)
  • This is standard practice for thousands of sandboxed Linux desktop apps

Common GPU device nodes:

| Device | Purpose |
| --- | --- |
| /dev/dri/card* | DRM display devices |
| /dev/dri/renderD* | GPU compute/render (Vulkan, OpenGL) |
| /dev/kfd | AMD ROCm/HIP kernel driver |
| /dev/nvidia* | NVIDIA CUDA/driver |

Technical Considerations

  1. Security: Device access is read/write to GPU hardware only, with no filesystem escape. Flatpak treats --device=dri as a permission that is safe for general use (https://docs.flatpak.org/en/latest/sandbox-permissions.html).
  2. Ordering: bwrap processes args in order, and a later mount shadows an earlier one. Device binds should come after --dev /dev so the fresh devtmpfs doesn't hide them (https://github.com/containers/bubblewrap/issues/248); see the example invocation after this list.
  3. PCI sysfs: Some GPU tools also need read-only access to /sys/bus/pci and /sys/devices/pci* for device enumeration (https://wiki.alpinelinux.org/wiki/Bubblewrap/Examples).
  4. Graceful degradation: --dev-bind-try (not --dev-bind) ensures systems without GPUs don't error.
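
Putting points 2 through 4 together, an illustrative bwrap invocation (not Claude Code's actual command line, just the flag ordering) could look like:

# Device binds come after --dev /dev so the fresh devtmpfs doesn't shadow them;
# the *-try variants silently skip paths that don't exist on this machine.
bwrap \
  --ro-bind /usr /usr \
  --symlink usr/bin /bin \
  --symlink usr/lib /lib \
  --symlink usr/lib64 /lib64 \
  --ro-bind /etc /etc \
  --proc /proc \
  --dev /dev \
  --dev-bind-try /dev/dri /dev/dri \
  --dev-bind-try /dev/kfd /dev/kfd \
  --ro-bind-try /sys/bus/pci /sys/bus/pci \
  --ro-bind-try /sys/devices /sys/devices \
  --bind "$PWD" "$PWD" \
  --chdir "$PWD" \
  vulkaninfo --summary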

References

  • https://www.mankier.com/1/bwrap - --dev-bind, --dev-bind-try documentation
  • https://docs.flatpak.org/en/latest/sandbox-permissions.html - --device=dri for GPU access
  • https://wiki.archlinux.org/title/Bubblewrap/Examples - GPU passthrough patterns
  • https://wiki.alpinelinux.org/wiki/Bubblewrap/Examples - mpv GPU example with renderD128
  • https://developer.nvidia.com/blog/improving-cuda-initialization-times-using-cgroups-in-certain-scenarios/ - bwrap with NVIDIA devices

Test Cases

These commands should work in sandbox mode with GPU passthrough:

# AMD ROCm/HIP
rocm-smi --showid                    # List AMD GPUs
rocminfo                             # ROCm device info
hipconfig --version                  # HIP configuration
clinfo                               # OpenCL devices

# NVIDIA CUDA
nvidia-smi                           # List NVIDIA GPUs
nvcc --version                       # CUDA compiler

# Vulkan (vendor-agnostic)
vulkaninfo --summary                 # Should show real GPU, not llvmpipe
vkcube                               # Vulkan test cube

# OpenGL
glxinfo | grep "OpenGL renderer"     # Should show GPU, not llvmpipe

# Simple compute test
python3 -c "import torch; print(torch.cuda.is_available())"      # NVIDIA (and ROCm builds, which route HIP through the CUDA API)
python3 -c "import torch; print(torch.version.hip)"              # AMD: prints the HIP version on ROCm builds

benvanik · Dec 05 '25 00:12

I did end up with a workaround, but this would be a really good feature to support natively so trickery is not required.

  1. Ensure ~/.local/bin/ is on your PATH before /usr/bin
  2. Make ~/.local/bin/bwrap:
#!/bin/bash
# GPU-aware bwrap wrapper for Claude Code sandbox.
#
# Claude Code's sandbox uses bubblewrap but doesn't pass through GPU devices.
# This wrapper injects --dev-bind-try flags for GPU access.
#
# CRITICAL: Device binds must come AFTER --dev /dev to avoid being shadowed
# by the fresh devtmpfs mount. This wrapper parses args and injects at the
# correct position.
#
# Devices passed through:
#   /dev/dri/*     - DRM/Vulkan/OpenGL (all GPUs)
#   /dev/kfd       - AMD ROCm/HIP kernel driver
#   /dev/nvidia*   - NVIDIA CUDA driver
#
# --dev-bind-try gracefully skips missing devices (no error on non-GPU systems).
#
# See: https://github.com/anthropics/claude-code/issues/13108

set -euo pipefail

REAL_BWRAP=/usr/bin/bwrap

# GPU devices to pass through.
GPU_DEVICES=(
    /dev/dri
    /dev/kfd
    /dev/nvidia0
    /dev/nvidiactl
    /dev/nvidia-uvm
    /dev/nvidia-uvm-tools
    /dev/nvidia-modeset
)

# Build new argument list, inserting GPU binds after --dev.
args=()
inject_next=false

for arg in "$@"; do
    args+=("$arg")

    if [[ "$inject_next" == true ]]; then
        # Just saw --dev and now its DEST arg; inject GPU devices after.
        for dev in "${GPU_DEVICES[@]}"; do
            args+=(--dev-bind-try "$dev" "$dev")
        done
        inject_next=false
    fi

    if [[ "$arg" == "--dev" ]]; then
        inject_next=true
    fi
done

exec "$REAL_BWRAP" "${args[@]}"
  3. Relaunch claude (a quick sanity check is sketched below)
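
Before relaunching, a quick check that the wrapper is actually the one being picked up (this assumes ~/.local/bin precedes /usr/bin on your PATH):

# Should print ~/.local/bin/bwrap, not /usr/bin/bwrap.
command -v bwrap

# Outside the sandbox the GPU device nodes should already be visible.
ls /dev/dri /dev/kfd 2>/dev/null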

Hope that helps someone trying to teach claude to do GPU stuff :)

benvanik · Dec 05 '25 01:12