nvidia-docker
MPS Support
Hi,
When I use the CUDA Multi-Process Service (MPS) in an nvidia-docker environment, I run into a couple of issues, so I'm wondering whether MPS is supported in nvidia-docker. Please help me, thanks in advance~
Here are the problems I have met:
- When I run `nvidia-cuda-mps-control -d` to start the MPS daemon inside nvidia-docker, I can't see this process from `nvidia-smi`; however, I can see the process from the host machine. In comparison, when I run the same command, `nvidia-cuda-mps-control -d`, on the host machine (a physical server), I do see it from `nvidia-smi` (you need to run a GPU program first to actually start the MPS server).
- I tried running Caffe training with MPS as an example, with 2 training processes at the same time in the nvidia-docker environment. It showed:
F0703 13:39:15.539633 97 common.cpp:165] Check failed: error == cudaSuccess (46 vs. 0) all CUDA-capable devices are busy or unavailable
In comparison, this works fine on the host (physical machine).
I'm trying this on a P100 GPU with Ubuntu 14:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
Docker version 17.04.0-ce, build 4845c56
I hope this is the right place to ask, thanks again.
Short answer: it is not supported for now. However, we are looking at it for the 2.0 timeframe, but there are a lot of corner cases that need to be investigated.
I'll update this issue with additional information once we are confident it could work properly.
Hi, does 2.0 support CUDA 9 for Volta MPS now? @3XX0, thanks.
This lack of MPS support seems like it would be a blocker for creating service deployments in orchestration. I'll be following the outcome in anticipation of a pull request covering the Swarm or Kubernetes use case.
Any progress? or is there any workaround so I can use CUDA Multi-Process Service in the container?
Shouldn't it be the other way around? I.e., the MPS server should run on the host so it can allocate process time to multiple containers? Is that an already supported architecture?
With 2.0 it should work as long as you run the MPS server on the host and use `--ipc=host`. We're working toward a better integration though, so I'll keep this issue open.
```shell
# Launch two containers on the second GPU device
sudo CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 nvidia-cuda-mps-control -d
docker run -ti --rm -e NVIDIA_VISIBLE_DEVICES=1 --runtime=nvidia --ipc=host nvidia/cuda
docker run -ti --rm -e NVIDIA_VISIBLE_DEVICES=1 --runtime=nvidia --ipc=host nvidia/cuda
echo quit | sudo nvidia-cuda-mps-control
```
@3XX0,
Does it mean that we can set and limit CUDA_MPS_ACTIVE_THREAD_PERCENTAGE for each container? Any examples of usage would really help.
Could you please elaborate what you mean by "better integration"?
Thank you
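For illustration, per-container limits might look roughly like this. This is a sketch, not a confirmed supported setup: it assumes the MPS daemon is already running on the host, the 30/70 split is arbitrary, and `./my_app` is a placeholder for an actual CUDA workload.

```shell
# Hypothetical sketch: two containers sharing GPU 0 through a host-side MPS daemon,
# each limited to a fraction of the SMs. CUDA_MPS_ACTIVE_THREAD_PERCENTAGE is
# read by the CUDA client library when the context is created.
docker run -d --runtime=nvidia --ipc=host \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=30 \
    nvidia/cuda ./my_app   # placeholder workload

docker run -d --runtime=nvidia --ipc=host \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=70 \
    nvidia/cuda ./my_app   # placeholder workload
```

Note that this variable can also be set on the MPS control daemon itself, in which case it acts as a default for all clients; whether a per-client override is honored depends on the driver version.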
@3XX0 How much does `--ipc=host` compromise security? Somebody asked the question on SO but there's no answer yet: https://stackoverflow.com/questions/38907708/docker-ipc-host-and-security
@3XX0 Any update on when nvidia-docker will officially support MPS?
@3XX0 I did some tests and `--ipc=host` does appear to work. But is there anything else we should pay attention to when running the current nvidia-docker 2 under MPS? Would you recommend using it in production? It would be super helpful if you could provide some guidance here.
I've added a wiki page on how to use MPS with Docker Compose: https://github.com/NVIDIA/nvidia-docker/wiki/MPS-(EXPERIMENTAL)
You can look at the `docker-compose.yml` file for implementation details.
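For anyone who can't open the wiki, the relevant part of such a compose file presumably boils down to something like the following. This is a sketch under my own assumptions, not the wiki's exact file; service names and the image are placeholders.

```yaml
version: "2.3"
services:
  mps-daemon:
    image: nvidia/cuda          # placeholder image
    runtime: nvidia
    ipc: host
    cap_add:
      - SYS_ADMIN               # assumed: lets the container set EXCLUSIVE_PROCESS mode
    command: nvidia-cuda-mps-control -f   # -f keeps the daemon in the foreground
  worker:
    image: nvidia/cuda          # placeholder image
    runtime: nvidia
    ipc: host
    depends_on:
      - mps-daemon
```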
Hi @flx42, is it possible to provide a compose file whose format version is 2.1? Lots of companies still use Docker 1.12 in their clusters and cannot upgrade their Docker version to 17.06 in the short term.
@azazhu are you running RHEL/Atomic's fork of Docker? If you do, you can just remove the `runtime:` lines and it should work fine. That's the `docker` package on RHEL/CentOS and probably other derivatives.
If that's not what you are running, you won't be able to make it work, since the `runtime` option requires format 2.3:
https://docs.docker.com/compose/compose-file/compose-versioning/#version-23
Thx @flx42, could you check whether my understanding is correct:
- nvidia-docker can work with Volta MPS even if we don't use the docker-compose file you provided, right?
- We just need to: a) use nvidia-docker2; b) (recommended) set EXCLUSIVE_PROCESS compute mode on the host machine; c) start the MPS daemon (nvidia-cuda-mps-control) on the host machine; d) set CUDA_MPS_PIPE_DIRECTORY on the host machine; e) make sure the container can read the path of CUDA_MPS_PIPE_DIRECTORY by using -v; f) start the container with "--ipc=host". Are my a) through f) right?
- Another question: CUDA_MPS_ACTIVE_THREAD_PERCENTAGE should be set in the container instead of on the host machine, right?
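The steps above, spelled out with the plain docker CLI, might look roughly like this. A sketch under stated assumptions: GPU 0, a pipe directory of /tmp/nvidia-mps, and the base `nvidia/cuda` image; adjust paths and device indices to your setup.

```shell
# a/b) On the host: set EXCLUSIVE_PROCESS compute mode on GPU 0 (needs root)
sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

# c/d) On the host: pick a pipe directory and start the MPS control daemon
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
nvidia-cuda-mps-control -d

# e/f) Run the container with the pipe directory bind-mounted and host IPC
docker run -ti --rm --runtime=nvidia --ipc=host \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps \
    -v /tmp/nvidia-mps:/tmp/nvidia-mps \
    nvidia/cuda

# Cleanup: stop the daemon and restore the default compute mode
echo quit | nvidia-cuda-mps-control
sudo nvidia-smi -i 0 -c DEFAULT
```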
Yes, that should work. But you can also containerize the MPS daemon, like in the Docker Compose example. I need to document the steps with the docker CLI too.
> Another question: CUDA_MPS_ACTIVE_THREAD_PERCENTAGE should be set in the container instead of on the host machine, right?
IIRC you can set this value for the MPS daemon, or for all CUDA client apps. I think both work fine.
Thx @flx42, what do you mean by "containerize the MPS daemon"? Launching the MPS daemon (nvidia-cuda-mps-control) on both the host machine and in the container? In my experiment, I only launched nvidia-cuda-mps-control on the host machine (I didn't launch it in the container) and it looks like it works fine.
Yes, you can launch it inside a container or on the host. Both ways will work.
Hi @flx42,
- It would be great if you could document the steps with the docker CLI, as I failed to launch docker-compose. I got "ERROR: could not find an available, non-overlapping IPv4 address pool among the defaults to assign to the network" in my work env. I tried changing the "bip" to avoid the subnet conflict, but still hit the same error.
- I use the method I mentioned above and it works, but it looks different from https://github.com/NVIDIA/nvidia-docker/wiki/MPS-(EXPERIMENTAL).
In `docker-compose.yml`, it looks like the container has sys admin permission, so the container can set the GPU mode (to EXCLUSIVE_PROCESS) and launch the MPS daemon by itself (please correct me if my understanding is wrong). With the method I used, the GPU mode is set and the MPS daemon is launched by the host machine, and the container doesn't have sys admin permission. Both methods can work, right?
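For reference, the containerized-daemon variant might be sketched like this. These commands are my own assumptions about how the wiki's compose file translates to the docker CLI, not a confirmed recipe; GPU 0 and the `nvidia/cuda` image are placeholders.

```shell
# Hypothetical variant: the MPS daemon runs inside a container instead of on
# the host. CAP_SYS_ADMIN is assumed to be what allows it to change the
# compute mode; -f keeps nvidia-cuda-mps-control in the foreground.
docker run -d --runtime=nvidia --ipc=host --cap-add=SYS_ADMIN \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    nvidia/cuda \
    sh -c "nvidia-smi -c EXCLUSIVE_PROCESS && nvidia-cuda-mps-control -f"

# Client containers then only need --ipc=host, no extra capabilities
docker run -ti --rm --runtime=nvidia --ipc=host \
    -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda
```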
@flx42 Does MPS support Pascal GPUs in nvidia-docker containers?
@GoodJoey not with the approach documented above, you would need a Volta GPU.
@flx42 In the MPS wiki, does Volta mean the Volta architecture or a Volta GPU in the sentence 'Only Volta MPS is supported'? Also, does 7.0 mean Compute Capability 7.0 in the sentence 'NVIDIA GPU with Architecture >= Volta (7.0)'? Looking forward to your reply, thanks!
Seems like MPS is not supported on the newest Docker version; in particular, it's not `--runtime=nvidia` but `--gpus=all` now.
Also the missing support for docker-compose is annoying.
This example shows well that the containers have some kind of problem with CUDA:
```shell
sudo CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 nvidia-cuda-mps-control -d  # start daemon
docker run -it --rm -e NVIDIA_VISIBLE_DEVICES=1 --gpus=all --ipc=host tensorflow/tensorflow:2.1.0-gpu-py3 python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
echo quit | sudo nvidia-cuda-mps-control  # shutdown daemon
```
Would really love to see "usable" support of mps with docker
Any update on this issue?
> Seems like MPS is not supported on the newest Docker version; in particular, it's not `--runtime=nvidia` but `--gpus=all` now. Also the missing support for docker-compose is annoying. This example shows well that the containers have some kind of problem with CUDA. Would really love to see "usable" support of MPS with docker.
Hi, have you solved this problem?
> @3XX0, does it mean that we can set and limit CUDA_MPS_ACTIVE_THREAD_PERCENTAGE for each container? Any examples of usage would really help. Could you please elaborate what you mean by "better integration"? Thank you
Hi, have you solved this problem? I want to set a different CUDA_MPS_ACTIVE_THREAD_PERCENTAGE for each container, such as 3×30% and 1×10% on a specific GPU.
any update?
We are working on a DRA driver for NVIDIA GPUs (https://github.com/NVIDIA/k8s-dra-driver), which will include better MPS support.
If there are use cases not covered by this (e.g. outside of K8s), please create an issue describing the use case against https://github.com/NVIDIA/nvidia-container-toolkit.