metaflow feat: add GPU support to metaflow-dev minikube setup

Summary

Add optional GPU support for minikube in metaflow-dev, with auto-detection of NVIDIA and AMD GPUs
Provide manual control via MINIKUBE_ENABLE_GPU environment variable
Enable GPU workloads like @resources(gpu=1) in local development environments

Changes Made

Auto-detection logic: Detects NVIDIA (nvidia-smi) and AMD (rocm-smi) GPUs automatically
Environment variable control: MINIKUBE_ENABLE_GPU=auto|true|false (default: auto)
User feedback: Informative messages about GPU detection status during startup
Help documentation: Updated help text with environment variable usage
Conditional flag addition: Adds --gpus all to minikube start only when appropriate
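
A minimal sketch of how detection along these lines could look in shell. It assumes a GPU counts as present when the vendor tool is on PATH and exits successfully. The function name detect_gpu_vendor is hypothetical and not taken from this PR, and the actual Makefile logic may differ.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): prints "nvidia", "amd", or "none".
# Assumption: a GPU is considered available only if the vendor tool exists
# on PATH and runs with exit status 0.
detect_gpu_vendor() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "nvidia"
  elif command -v rocm-smi >/dev/null 2>&1 && rocm-smi >/dev/null 2>&1; then
    echo "amd"
  else
    echo "none"
  fi
}
```

Calling `detect_gpu_vendor` on a machine without either tool prints `none`, which corresponds to the case where no GPU flag is added.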

Modes of Operation

auto (default): Automatically detects GPU availability and enables if found
true: Force enables GPU support regardless of detection
false: Explicitly disables GPU support
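
The three modes can be sketched as a small shell function that maps the MINIKUBE_ENABLE_GPU value and a detection result to the extra flag for minikube start. The function name resolve_gpu_flag and the yes/no detection argument are hypothetical, and rejecting unknown values is an assumption, since the PR does not say how invalid values are handled.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR).
# $1: value of MINIKUBE_ENABLE_GPU (auto|true|false, default auto)
# $2: detection result, "yes" or "no" (default no)
# Prints the extra flag for `minikube start`, or nothing when GPU is off.
resolve_gpu_flag() {
  case "${1:-auto}" in
    true)
      # Forced on, regardless of detection.
      echo "--gpus all"
      ;;
    false)
      # Explicitly off: no flag.
      ;;
    auto)
      # Enable only when a GPU was detected.
      if [ "${2:-no}" = "yes" ]; then
        echo "--gpus all"
      fi
      ;;
    *)
      # Assumption: unknown values are treated as an error.
      echo "invalid MINIKUBE_ENABLE_GPU value: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `resolve_gpu_flag "${MINIKUBE_ENABLE_GPU:-auto}" no` prints nothing, while `resolve_gpu_flag true no` prints `--gpus all`.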

Test Plan

✅ Verified Makefile syntax with make help
✅ Tested dry-run with make -n setup-minikube (shows the "no GPU detected" message)
✅ Tested forced enable with MINIKUBE_ENABLE_GPU=true (correctly adds --gpus all flag)
✅ Confirmed help text displays new environment variable documentation

Example Usage

# Auto-detect GPU (default behavior)
make setup-minikube

# Force enable GPU support
MINIKUBE_ENABLE_GPU=true make setup-minikube

# Explicitly disable GPU support
MINIKUBE_ENABLE_GPU=false make setup-minikube

Fixes #2606

Sep 17 '25 01:09 cnaples79

Thanks for the feedback @feltech! I've updated the implementation to address the Docker compatibility concerns:

Changes Made

🔧 Improved Docker Compatibility:

Default to --devices nvidia.com/gpu=all for NVIDIA GPUs (more compatible with different Docker configurations)
Keep --gpus all for AMD/other GPUs
This addresses the NixOS Docker issue you mentioned

⚙️ Enhanced Control Options: Added MINIKUBE_GPU_FLAG environment variable for explicit control:

auto (default): Smart selection based on GPU type
gpus: Force --gpus all format
devices: Force --devices nvidia.com/gpu=all format
Custom value: User-provided flag (e.g., --devices nvidia.com/gpu=2)

Example Usage

# Auto-detect best GPU flag (default)
make setup-minikube

# Force devices format (good for Docker compatibility issues)
MINIKUBE_GPU_FLAG=devices make setup-minikube

# Force legacy gpus format
MINIKUBE_GPU_FLAG=gpus make setup-minikube

# Custom GPU specification
MINIKUBE_GPU_FLAG="--devices nvidia.com/gpu=2" make setup-minikube

This should resolve the Docker configuration compatibility issues while maintaining flexibility for different setups. Let me know if this addresses your concerns!

Sep 17 '25 12:09 cnaples79

Thanks for the clarification, and you're absolutely right: minikube doesn't support --devices. I've updated the PR to remove the --devices path and always pass --gpus all to minikube start when a GPU is detected or GPU support is forced via MINIKUBE_ENABLE_GPU=true.

Summary of changes:

Remove MINIKUBE_GPU_FLAG and the --devices nvidia.com/gpu=all path
Keep the simple, valid --gpus all flag for minikube
Preserve auto-detection and the MINIKUBE_ENABLE_GPU environment variable controls

If you want me to also document the separate Docker CLI considerations (for folks not using minikube), I can add a short note in the devtools help.

Sep 17 '25 21:09 cnaples79