Zero Zeng
It should still work in 8.4 but will be deprecated in the future. More info: https://docs.nvidia.com/deeplearning/tensorrt/release-notes/tensorrt-8.html#rel-8-4-0-EA
After exporting to ONNX, can you run the model with trtexec? I suspect that Torch and TRT may be using different CUDA libraries.
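For reference, a typical invocation looks something like this (the model filename is a placeholder); `--verbose` also makes the log show which tactic sources (cuBLAS/cuBLASLt/cuDNN) are in use:

```
trtexec --onnx=model.onnx --verbose
```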
Can you share the ONNX model here?
Looks like there are similar issues: https://github.com/NVIDIA/TensorRT/issues/1818 and https://github.com/NVIDIA/TensorRT/issues/2123. Can you check your cuBLASLt version in the log?
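If it helps, a quick way to dump the versions in play from Python (a minimal sketch; it won't show the cuBLASLt version itself, which appears in the verbose TensorRT log, but it confirms which TRT/CUDA combination you are on):

```python
import tensorrt as trt
import pycuda.driver as cuda

print("TensorRT:", trt.__version__)              # e.g. 8.4.x
print("CUDA (pycuda built against):", cuda.get_version())
print("CUDA driver:", cuda.get_driver_version())
```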
Also: https://github.com/NVIDIA/TensorRT/issues/866
Can you try removing

```
import torch
import torchvision.models as models
```

and all the torch stuff from your script? Only leave the TRT part, like:

```
import os
import ...
```
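For concreteness, a minimal sketch of what a torch-free TRT script could look like (the engine path, binding indices, shapes, and dtype below are placeholders, not taken from your setup):

```python
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# deserialize an engine built beforehand, e.g. with trtexec --saveEngine
with open("model.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# host/device buffers (assumes one input at binding 0, one output at binding 1)
h_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
h_output = np.empty(tuple(context.get_binding_shape(1)), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)

stream = cuda.Stream()
cuda.memcpy_htod_async(d_input, h_input, stream)
context.execute_async_v2(bindings=[int(d_input), int(d_output)],
                         stream_handle=stream.handle)
cuda.memcpy_dtoh_async(h_output, d_output, stream)
stream.synchronize()
print(h_output.shape)
```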
I couldn't reproduce this in my environment with CUDA 11.6. Also, it seems you're missing `import pycuda.autoinit`. Can you try upgrading to CUDA 11?

```
import pycuda.driver as cuda
import ...
```
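To illustrate why that import matters (a tiny sketch, not your code): `pycuda.autoinit` initializes the driver and makes a CUDA context current as a side effect of being imported, and without an active context allocation calls fail:

```python
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- side-effect import: driver init + context

d_buf = cuda.mem_alloc(1 << 20)  # succeeds only because a context is current
```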
My code:

```
import os
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
import time

# build engine with trtexec
BATCH_SIZE = 32
target_dtype = np.float16
...
```
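For the "build engine with trtexec" step, the command is along these lines (filenames are placeholders; `--fp16` matches the `np.float16` target dtype above):

```
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```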
I haven't used it before, but I guess https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/ExecutionContext.html#tensorrt.IExecutionContext.report_to_profiler is the answer. @nvpohanh may know more about it.
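If that is the right API, usage would look roughly like this (a hedged sketch; it assumes `context` and `bindings` are already set up as in the script above). Attaching a `trt.IProfiler` subclass makes execution synchronous and triggers a per-layer timing callback:

```python
import tensorrt as trt

class LayerProfiler(trt.IProfiler):
    def __init__(self):
        super().__init__()  # required when subclassing trt.IProfiler
        self.timings = {}

    def report_layer_time(self, layer_name, ms):
        # TensorRT invokes this once per layer per profiled execution
        self.timings[layer_name] = self.timings.get(layer_name, 0.0) + ms

profiler = LayerProfiler()
context.profiler = profiler   # context from an existing engine setup
context.execute_v2(bindings)  # synchronous run triggers the callbacks
for name, ms in profiler.timings.items():
    print(f"{name}: {ms:.3f} ms")
```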