nvidia-docker
Selecting GPU device=3 still uses only the first GPU (device=1)
Host: Ubuntu 18.04, Docker 20.10.1
When I run tensorflow/serving:2.6.0-gpu with all GPUs, it only uses one.
sudo docker run -p 8500:8500 -p 8501:8501 --gpus 4 --mount type=bind,source=/home/building,target=/models/building -e MODEL_NAME=building -t tensorflow/serving:2.6.0-gpu --enable_batching=true --batching_parameters_file=/models/building/batching_parameters.txt &
What's more, when I select GPU device=3, it still uses only the first GPU (device=1).
docker run --gpus '"device=3"'
Hi @PingYufeng, could you please provide the output of nvidia-smi on the host as well as in the container for the different situations that you are describing, especially the case where device=3 is selected but only device 1 is available in the container?
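One way to collect that comparison is sketched below. It lists GPU UUIDs rather than indices, because devices inside a container are numbered from 0 regardless of which host GPU was selected, so the index alone can be misleading. The helper names (extract_uuids, host_uuids, container_uuids) are hypothetical. Overriding the entrypoint assumes the image's default entrypoint starts the model server, and that the NVIDIA container runtime makes nvidia-smi available inside the container.

```shell
#!/bin/sh
# Hypothetical helper: pull GPU UUIDs out of `nvidia-smi -L` output on stdin.
# Expected line shape: "GPU 0: <name> (UUID: GPU-xxxxxxxx-...)"
extract_uuids() {
  sed -n 's/.*(UUID: \(GPU-[^)]*\)).*/\1/p'
}

# UUIDs of all GPUs on the host, in host index order.
host_uuids() {
  nvidia-smi -L | extract_uuids
}

# UUIDs visible inside a container started with the given --gpus value.
# Assumption: nvidia-smi is injected into the container by the NVIDIA runtime.
container_uuids() {
  sudo docker run --rm --gpus "$1" --entrypoint nvidia-smi \
    tensorflow/serving:2.6.0-gpu -L | extract_uuids
}

# Usage (needs a GPU host with Docker and the NVIDIA container toolkit):
#   host_uuids
#   container_uuids '"device=3"'
```

If the single UUID printed by container_uuids '"device=3"' matches the fourth line of host_uuids, the container was given the requested GPU, even though it shows up as GPU 0 inside the container.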
Looking at some TF Serving resources on the web, it seems to be targeted specifically at a single-GPU use case (see for example https://stephenweixu.medium.com/serving-multiple-ml-models-on-multiple-gpus-with-tensorflow-serving-fe2ade7aa16b). This suggests that the "first" GPU visible to the container will most likely always be used.
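If a single TF Serving instance really does use only the first visible GPU, one possible workaround is to start one container per GPU, each restricted to a single device. The sketch below only prints the commands so they can be reviewed before running. The function name print_serving_commands and the port offset scheme (10 per GPU index) are hypothetical choices, and the approach itself is an assumption that has not been confirmed in this thread.

```shell
#!/bin/sh
# Sketch: print (not run) one TF Serving `docker run` command per GPU index.
# Each container sees exactly one GPU. Host ports are offset by 10 per index,
# which is an arbitrary, hypothetical scheme.
print_serving_commands() {
  for gpu in "$@"; do
    grpc_port=$((8500 + 10 * gpu))
    rest_port=$((8501 + 10 * gpu))
    printf '%s\n' "sudo docker run -d --gpus '\"device=${gpu}\"' -p ${grpc_port}:8500 -p ${rest_port}:8501 --mount type=bind,source=/home/building,target=/models/building -e MODEL_NAME=building -t tensorflow/serving:2.6.0-gpu"
  done
}

# Usage: print_serving_commands 0 1 2 3
```

Requests would then need to be spread across the per-GPU ports by the client or by a load balancer in front of the containers.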
The question of the behaviour when you select a specific GPU is still valid.