TensorRT
Problem converting an ONNX model to TRT
Description
Hi, I want to convert my ONNX model to TRT. I use the following command:
/usr/src/tensorrt/bin/trtexec --onnx=model_folded.onnx --verbose --explicitBatch --saveEngine=model.trt
When I searched for this problem, I found an NVIDIA collaborator suggesting that it may be solved by the command below; he also mentioned that constant-folding the model may help:
polygraphy surgeon sanitize model.onnx --fold-constants --output model_folded.onnx
However, conversion to TRT fails for both the original and the folded model.
The error is:
[08/09/2022-07:55:01] [V] [TRT] --------------- Timing Runner: {ForeignNode[onnx::Add_746 + (Unnamed Layer* 20) [Shuffle]...Transpose_245 + (Unnamed Layer* 459) [Shuffle]]} (Myelin)
[08/09/2022-07:55:02] [E] Error[1]: [graphContext.h::~MyelinGraphContext::35] Error Code 1: Myelin (no further information)
[08/09/2022-07:55:02] [W] [TRT] Skipping tactic 0x0000000000000000 due to Myelin error: cuBLAS initialization failed: 3.
[08/09/2022-07:55:02] [V] [TRT] Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[08/09/2022-07:55:02] [V] [TRT] Deleting timing cache: 164 entries, served 8188 hits since creation.
[08/09/2022-07:55:02] [E] Error[10]: [optimizer.cpp::computeCosts::3626] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[onnx::Add_746 + (Unnamed Layer* 20) [Shuffle]...Transpose_245 + (Unnamed Layer* 459) [Shuffle]]}.)
[08/09/2022-07:55:02] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[08/09/2022-07:55:02] [E] Engine could not be created from network
[08/09/2022-07:55:02] [E] Building engine failed
[08/09/2022-07:55:02] [E] Failed to create engine from model or file.
[08/09/2022-07:55:02] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8402] # /usr/src/tensorrt/bin/trtexec --onnx=model_folded.onnx --verbose --explicitBatch --saveEngine=model.trt
Environment
TensorRT Version: 8.4.2.4-1+cuda11.6
NVIDIA GPU: Tesla P100
NVIDIA Driver Version: 440.33.01
CUDA Version: 11.3
CUDNN Version: 8.4.0.27
Operating System: Ubuntu 20.04.2 LTS
Python Version (if applicable): 3.8.10
Tensorflow Version (if applicable): 2.9.1
PyTorch Version (if applicable): 1.9.0a0+c3d40fd
Looks like a bug. Can you share the ONNX model here?
@zerollzeng Unfortunately no. It's 485Mb and I can't upload it.
Looks like a CUBLAS_STATUS_ALLOC_FAILED:
cuBLAS initialization failed: 3
Maybe you're running out of memory on your GPU? Do other networks work on this GPU? I'm wondering if it could be a driver/setup issue.
@pranavm-nvidia No. While trtexec was running, I checked my GPU memory every second, and it never used all of it. I also checked other resources, like RAM, and everything was fine.
@zerollzeng Unfortunately no. It's 485Mb and I can't upload it.
You can upload it to Google Drive and share the link here.
@zerollzeng Yeah I know that but I don't have permission. It's commercial.
I would suspect this is a Myelin bug. @jackwish for visibility.
As @pranavm-nvidia mentioned above, cuBLAS initialization failed: 3 is likely to be a setup issue.
CUDA 11.x requires CUDA driver >= 450.80.02* according to the CUDA compatibility doc while your setup is CUDA 11.3 + driver 440.33.01.
@alexandercesarr Could you please upgrade your driver to an appropriate version (we suggest using the one bundled in the CUDA toolkit package)? To isolate similar issues, we suggest starting with the same CUDA version as the TensorRT build, i.e. CUDA 11.6 in your case.
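For reference, the driver-versus-CUDA check above can be done mechanically. Below is a small stand-alone Python sketch (not part of any NVIDIA tooling) that compares a driver version string against the 450.80.02 minimum cited above; in practice the installed version would be read from `nvidia-smi`.

```python
# Sketch: compare an NVIDIA driver version string against a required minimum.
# The minimum below is the CUDA 11.x requirement cited in this thread
# (450.80.02); the installed version would normally come from `nvidia-smi`.

def parse_version(v: str) -> tuple:
    """Turn '440.33.01' into (440, 33, 1) so tuples compare numerically."""
    return tuple(int(part) for part in v.split("."))

def driver_ok(installed: str, required: str = "450.80.02") -> bool:
    """True if the installed driver meets the required minimum."""
    return parse_version(installed) >= parse_version(required)

print(driver_ok("440.33.01"))  # driver from this thread -> False
print(driver_ok("515.65.01"))  # an example newer driver -> True
```

Running this against the 440.33.01 driver from the environment above immediately flags it as too old for CUDA 11.3, which matches the diagnosis in this thread.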
Hi, sorry for my late reply. That was it. I upgraded my driver and it ran without any problem. Thanks @zerollzeng & @jackwish
Glad to hear that! Closing this issue now. Please let us know if you hit any further issues.