docker-py
create_container does not support the --gpus param
docker version: 19.03
I want to set --gpus all when creating a container, but found that docker-py does not support this param.
Hello @ffteen thank you for the report
any progress on this issue?
I think one hacky way, though not very reliable, is to use the low-level API and overwrite the host configuration. Since I only tried to follow the Docker CLI code in Go, I'm not sure how reliable/portable this solution is. It works on my machine and I thought it might help someone until official support is implemented.
The following code is a modification of the original DockerClient.containers.create() function that adds a DeviceRequest to the host configuration and otherwise works exactly like the original function:
import docker
from docker.models.images import Image
from docker.models.containers import _create_container_args
def create_with_device_request(client, image, command, device_request=None, **kwargs):
    if isinstance(image, Image):
        image = image.id
    kwargs['image'] = image
    kwargs['command'] = command
    kwargs['version'] = client.containers.client.api._version
    create_kwargs = _create_container_args(kwargs)
    # modification to the original create function
    if device_request is not None:
        create_kwargs['host_config']['DeviceRequests'] = [device_request]
    # end modification
    resp = client.api.create_container(**create_kwargs)
    return client.containers.get(resp['Id'])
# Example usage
device_request = {
    'Driver': 'nvidia',
    'Capabilities': [['gpu'], ['nvidia'], ['compute'], ['compat32'], ['graphics'], ['utility'], ['video'], ['display']],  # not sure which capabilities are really needed
    'Count': -1,  # enable all gpus
}
container = create_with_device_request(docker.from_env(), 'nvidia/cuda:9.0-base', 'nvidia-smi', device_request, ...)
I think the CLI client sets the NVIDIA_VISIBLE_DEVICES environment variable, so it's probably a good idea to do the same by passing environment={'NVIDIA_VISIBLE_DEVICES': 'all'} as a parameter of the create_with_device_request() call.
This enables all available gpus. You could modify this with different device_requests:
# enable two gpus
device_request = {
    'Driver': 'nvidia',
    'Capabilities': ...,
    'Count': 2,  # enable two gpus
}
# enable gpus with id or uuid
device_request = {
    'Driver': 'nvidia',
    'Capabilities': ...,
    'DeviceIDs': ['0', 'GPU-abcedfgh-1234-a1b2-3c4d-a7f3ovs13da1']  # enable gpus with id 0 and uuid
}
The environment parameter should then look like {'NVIDIA_VISIBLE_DEVICES': '0,1'} or {'NVIDIA_VISIBLE_DEVICES': '0,GPU-xxx'}, respectively.
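Putting it together, a call with both a device request and the environment variable could look like this (untested sketch, reusing the create_with_device_request() helper from above):
import docker

client = docker.from_env()
# request specific GPUs by index; DeviceIDs is used here instead of Count
device_request = {
    'Driver': 'nvidia',
    'Capabilities': [['gpu']],
    'DeviceIDs': ['0', '1'],
}
container = create_with_device_request(
    client,
    'nvidia/cuda:9.0-base',
    'nvidia-smi',
    device_request,
    environment={'NVIDIA_VISIBLE_DEVICES': '0,1'},
)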
I'm not sure which capabilities are really needed either!
Does create_service support the device request param?
I use the nvidia runtime instead.
As far as I can tell, services.create() does not support device requests.
Setting runtime='nvidia' is definitely the better approach, if possible.
The problem I had was that I use the nvidia-container-toolkit, which does not require installing the nvidia runtime, so setting runtime='nvidia' leads to Error: unknown runtime specified nvidia, while using --gpus=all works as expected.
Is there a better way to use nvidia-gpus with the nvidia-container-toolkit?
I have a change (that appears to work) that allows the "gpus" option in my fork. I'd like to create a PR for it, but when running the tests, this error (which is unrelated to the change) occurs:
tests/integration/api_service_test.py:379:53: F821 undefined name 'BUSYBOX'
Makefile:92: recipe for target 'flake8' failed
Is there a package that needs to be installed to fix this?
@hnine999 No, that's an error on our end - we'll fix it shortly. Feel free to submit your PR in the meantime!
The PR from @hnine999 is #2419
Hi - Any update with this feature?
Any update on this? It is badly needed. docker-py is functionally broken for running GPU enabled containers.
+1
this is actually a major feature for all data science community that runs tensorflow in docker on nvidia GPUs in the cloud. Why is this ignored for such a long time? 😞
Any update on this?
Still waiting for this to be supported... The only workaround for now is "docker run" with bash :(
At the moment, nvidia-container-toolkit still includes nvidia-container-runtime. So, you can still add nvidia-container-runtime as a runtime in /etc/docker/daemon.json:
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
Then restart the docker service (sudo systemctl restart docker) and use runtime="nvidia" in docker-py as before.
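For example (a minimal sketch, assuming the "nvidia" runtime entry above is in place and the daemon has been restarted):
import docker

client = docker.from_env()
# run with the registered nvidia runtime; remove=True cleans up the container afterwards
output = client.containers.run(
    'nvidia/cuda:9.0-base',
    'nvidia-smi',
    runtime='nvidia',
    remove=True,
)
print(output.decode())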
Thanks a bunch - that works BUT the daemon.json is missing a double quote in runtimes: { "runtimes": { "nvidia": { "path": "nvidia-container-runtime", "runtimeArgs": [] } } }
Is there a solid fix for this issue?
Thanks - updated my comment with that suggestion
Hi @jmsmkn, I installed nvidia-container-toolkit on Arch, but it does not come with nvidia-container-runtime. Any update on this? Thanks.
cd /usr/bin
ls | grep nvidia
nvidia-bug-report.sh
nvidia-container-cli
nvidia-container-runtime-hook
nvidia-container-toolkit
nvidia-cuda-mps-control
nvidia-cuda-mps-server
nvidia-debugdump
nvidia-modprobe
nvidia-persistenced
nvidia-settings
nvidia-sleep.sh
nvidia-smi
nvidia-xconfig
Simple "gpus=" keyword parameter, please !
This feature is badly needed by the many people working with GPUs for AI and HPC. Please add it as soon as you can; we would be very grateful.
Is this issue on some agenda? (This is your second most upvoted open issue at the moment.)
Hi all, I made a Python client for Docker that sits on top of the Docker client binary (the one written in go). It took me several months of work. It notably has support for gpus in docker.run(...) and docker.container.create(...), with all options that the CLI has.
It's currently only available to my sponsors, but it'll be open source with an MIT licence on May 1st, 2021 🙂
https://gabrieldemarmiesse.github.io/python-on-whales/
Hi all, in the end, making Python-on-whales pay-to-use wasn't a success. So I've open-sourced it.
It's free and on Pypi now. Have fun 😃
$ pip install python-on-whales
$ python
>>> from python_on_whales import docker
>>> print(docker.run("nvidia/cuda:11.0-base", ["nvidia-smi"], gpus="all"))
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06 Driver Version: 450.51.06 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 34C P8 9W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
https://github.com/gabrieldemarmiesse/python-on-whales
looks good!
In the end, I have just written a very simple wrapper around subprocess.run that builds an arg list (which can include the required GPU parameter) and captures stdout, stderr, the return code, and the execution duration.
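Roughly, such a wrapper could look like this (a minimal sketch, not the exact code described above):
import subprocess
import time

def docker_run(image, command, gpus=None):
    # build the docker CLI argument list, optionally with a --gpus flag
    args = ['docker', 'run', '--rm']
    if gpus is not None:
        args.append('--gpus=' + gpus)  # e.g. 'all' or 'device=0,1'
    args.append(image)
    args.extend(command)
    # capture stdout, stderr, the return code, and the execution duration
    start = time.time()
    result = subprocess.run(args, capture_output=True, text=True)
    duration = time.time() - start
    return result.returncode, result.stdout, result.stderr, duration

# e.g. docker_run('nvidia/cuda:11.0-base', ['nvidia-smi'], gpus='all')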
Incidentally, I have found that the AWS ML AMI works well with Docker/NVIDIA, with no further tricky configuration required. All I would say is to fire up an instance using the AMI, do the required apt update/upgrades, then freeze that as your AMI to use; it avoids a 5-minute delay! For my purposes, a root volume of 200GB works fine, as opposed to the vast default root volumes you get with the g3/g4 instances (maybe required if you are going to hibernate). But I am going a bit off-topic!
Hello team, is this a feature that you are thinking of adding? It would be of great value
@JoanFM I guess this functionality has already been implemented:
client.containers.run(
    'nvidia/cuda:9.0-base',
    'nvidia-smi',
    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
    ]
)
Not very elegant, but it works
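If you only need specific GPUs, docker.types.DeviceRequest also accepts device_ids instead of count (a sketch along the same lines, not tested here):
client.containers.run(
    'nvidia/cuda:9.0-base',
    'nvidia-smi',
    device_requests=[
        docker.types.DeviceRequest(device_ids=['0'], capabilities=[['gpu']])
    ]
)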
@matyushinleonid Thanks heaps! it worked
This works! This is the only solution that actually works, thanks so much! :)
