Can't Deploy Torchserve ONNX with GPU
🐛 Describe the bug
Disclaimer
- Running TorchServe + ONNX + CPU works fine.
- I am aware of the open issue below describing a similar situation, but the problem here goes further.
- https://github.com/pytorch/serve/issues/2425
Problem
- Can't Deploy Torchserve ONNX with GPU
Error logs
- Images built from any CUDA `runtime` or `base` image have `python3 -c "import torch; print(torch.cuda.is_available())"` returning False. For example:
  - `./build_image.sh -bi nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04 -t torchserve_cu116`
  - `./build_image.sh -bi nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04 -t torchserve_cu118`
  - Sidenote: specifying CUDA versions, e.g. `./build_image.sh --cv 113`, often results in an image-not-found error, probably because the bash script has not been updated.
- `torchserve:0.8.*-gpu` images result in a `Failed to create CUDAExecutionProvider` error, while `python3 -c "import torch; print(torch.cuda.is_available())"` returns True:
  [W:onnxruntime:Default, onnxruntime_pybind_state.cc:578 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
- `torchserve:0.7.1-gpu`: the only functioning image.
- `torchserve:0.7.0-gpu` or below: assumes the model is a PyTorch module and returns `AttributeError: 'InferenceSession' object has no attribute 'eval'`; GPU utilization, however, is fine.
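The `Failed to create CUDAExecutionProvider` warning can be reproduced in isolation inside the container with a quick check (a minimal diagnostic sketch; it assumes `onnxruntime` is installed and falls back gracefully if not):

```python
# Quick diagnostic: list the execution providers onnxruntime can actually use.
# If CUDAExecutionProvider is missing here, InferenceSession silently falls
# back to CPU and emits the warning shown above.
try:
    import onnxruntime as ort
    providers = ort.get_available_providers()
except ImportError:  # onnxruntime not installed in this environment
    providers = []

print("Available providers:", providers)
print("CUDA usable:", "CUDAExecutionProvider" in providers)
```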
Installation instructions
Tried both downloading the official TorchServe images and building images from source based on NVIDIA's base images.
Model Packaging
ONNX handler: https://gist.github.com/andy971022/19ed36022470f099c08ff28c20422244
Dockerfile: https://gist.github.com/andy971022/d11bf90fa4d3e0da37e8ee6ff9538acc
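For context, a custom handler is needed at all because `onnxruntime.InferenceSession` is not a `torch.nn.Module`, which is exactly why the older images fail with `'InferenceSession' object has no attribute 'eval'`. A minimal illustrative sketch of the pattern (class and attribute names here are my own, not necessarily what the gist uses):

```python
# Illustrative sketch, NOT the gist's exact code: a TorchServe-style custom
# handler that wraps an ONNX model instead of a torch.nn.Module.
class OnnxHandlerSketch:
    def initialize(self, context):
        # Deferred import so this sketch can be loaded without onnxruntime
        import onnxruntime as ort
        model_path = context.manifest["model"]["serializedFile"]
        # Ask for the GPU provider first, falling back to CPU if unavailable
        self.session = ort.InferenceSession(
            model_path,
            providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
        )

    def handle(self, data, context):
        # Feed the batch to the first model input and return all outputs
        inputs = {self.session.get_inputs()[0].name: data}
        return self.session.run(None, inputs)
```

The default handler path instead calls `model.eval()`, which only exists on PyTorch modules, hence the AttributeError on `torchserve:0.7.0-gpu` and below.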
config.properties
inference_address=http://0.0.0.0:7080
management_address=http://0.0.0.0:7081
metrics_address=http://localhost:7082
service_envelope=json
model_store=model-store
Versions
From the notebook environment
------------------------------------------------------------------------------------------
Environment headers
------------------------------------------------------------------------------------------
Torchserve branch:
torchserve==0.8.1
torch-model-archiver==0.8.1
Python version: 3.7 (64-bit runtime)
Python executable: /opt/conda/bin/python
Versions of relevant python libraries:
captum==0.6.0
numpy==1.21.6
nvgpu==0.10.0
open-clip-torch==2.20.0
pillow-avif-plugin==1.3.1
psutil==5.9.3
requests==2.31.0
requests-oauthlib==1.3.1
sentencepiece==0.1.99
torch==1.13.1
torch-model-archiver==0.8.1
torch-workflow-archiver==0.2.9
torchserve==0.8.1
torchvision==0.14.1
transformers==4.30.0
types-requests==2.30.0.0
wheel==0.40.0
torch==1.13.1
**Warning: torchtext not present ..
torchvision==0.14.1
**Warning: torchaudio not present ..
Java Version:
OS: Debian GNU/Linux 10 (buster)
GCC version: (Debian 8.3.0-6) 8.3.0
Clang version: N/A
CMake version: version 3.13.4
Is CUDA available: Yes
CUDA runtime version: 11.3.109
GPU models and configuration:
GPU 0: Tesla T4
GPU 1: Tesla T4
Nvidia driver version: 510.47.03
cuDNN version: None
Repro instructions
- Have the ONNX handler and any ONNX model, e.g. `visual.onnx`, in `dir-containing-onnx-assets/`
- Download the Dockerfile from the gist
- `docker build -t ts-test .`
- `docker run --gpus all -p 7080:7080 ts-test`
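Once the container from the steps above is running, the inference endpoint can be exercised like this (a sketch; the model name `visual` and the payload shape are assumptions, while the port and the `instances` envelope follow `inference_address` and `service_envelope=json` from the config.properties above):

```python
import json
import urllib.request

def make_body(payload):
    # service_envelope=json means requests are wrapped as {"instances": [...]}
    return json.dumps({"instances": [payload]}).encode()

def predict(payload, host="http://localhost:7080", model="visual"):
    # POST to the inference port configured in config.properties
    req = urllib.request.Request(
        f"{host}/predictions/{model}",
        data=make_body(payload),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the JSON envelope enabled, a successful call to `predict(...)` returns the response wrapped in a `predictions` field.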
Possible Solution
No response
Yeah, we are prioritizing a larger dev image that would have all these dependencies @agunapal
Hey, the image still does not have cuDNN support @msaroufim