ONNX CUDA session not working in Python backend
Bug Description
The ONNX CUDA session is not working in the Python backend. When attempting to run inference using the ONNX model with CUDAExecutionProvider, the session fails to initialize or execute properly.
Triton Information
Triton version: 22.07
Using Triton container: Yes
To Reproduce
Steps to reproduce the behavior: https://github.com/jsoto-gladia/onnx-in-python-backend
When I use CPUExecutionProvider, everything works fine. When I use CUDAExecutionProvider, I get the following error lines (a minimal sketch of the session setup follows the log below):
I1018 20:11:24.764594 1 python_be.cc:2248] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1018 20:11:24.773726 1 python_be.cc:2087] TRITONBACKEND_ModelFinalize: delete model state
E1018 20:11:24.773855 1 model_lifecycle.cc:626] failed to load 'onnx_in_python_backend' version 1: Internal: Stub process 'onnx_in_python_backend_0' is not healthy.
I1018 20:11:24.773900 1 model_lifecycle.cc:755] failed to load 'onnx_in_python_backend'
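For context, here is a minimal sketch of how such a session is typically created inside a Python-backend model.py. The file name model.onnx and the tensor names INPUT0/OUTPUT0 are placeholders for illustration, not taken from the linked repo:

```python
import os
import numpy as np
import onnxruntime as ort
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args["model_repository"] / args["model_version"] point at this
        # model's directory inside the Triton model repository.
        model_path = os.path.join(
            args["model_repository"], args["model_version"], "model.onnx"
        )
        # Switching this list to ["CPUExecutionProvider"] makes the model
        # load successfully; with CUDAExecutionProvider the stub dies.
        self.session = ort.InferenceSession(
            model_path, providers=["CUDAExecutionProvider"]
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # INPUT0/OUTPUT0 are placeholder tensor names.
            input0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            (result,) = self.session.run(None, {"INPUT0": input0.as_numpy()})
            out = pb_utils.Tensor("OUTPUT0", result.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```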
Expected behavior
The ONNX model should initialize and execute properly using the CUDAExecutionProvider, leveraging GPU acceleration for inference.
Inside the container I have logged:
triton-server-all-1 | cuda: 12.1
triton-server-all-1 | cudnn: 90100
triton-server-all-1 | onnxruntime: 1.19.2
triton-server-all-1 | torch: 2.4.1+cu121
triton-server-all-1 | providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
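For reference, output like the above can be produced with a snippet along these lines (an assumption about how the versions were gathered, since the logging code itself is not shown):

```python
import torch
import onnxruntime as ort

# Versions as seen by the Python runtime inside the container.
print("cuda:", torch.version.cuda)
print("cudnn:", torch.backends.cudnn.version())
print("onnxruntime:", ort.__version__)
print("torch:", torch.__version__)
print("providers:", ort.get_available_providers())
```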
On the host machine, running nvidia-smi gives:
NVIDIA-SMI 560.28.03 Driver Version: 560.28.03 CUDA Version: 12.6
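One way to narrow down whether the failure is in the ONNX Runtime / CUDA stack itself or in Triton's Python stub is to create a CUDA session directly in a plain Python shell inside the container (model.onnx is a placeholder path):

```python
import onnxruntime as ort

# If this raises, or silently falls back to CPU, the problem is in the
# ONNX Runtime / CUDA / cuDNN stack rather than in the Python backend.
sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
print(sess.get_providers())  # should list CUDAExecutionProvider first
```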
I'm facing the same problem. Are there any solutions?
Thanks for reporting the issue. When I tried to access the repro link, I couldn't open it. @jsoto-gladia, can you please verify the link so the team can reproduce the issue?