Zero Zeng
It's more like a Python memory-handling issue than a TRT memory leak; I would suggest using the C++ API if you want finer control over memory.
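For what it's worth, a minimal sketch of what explicit handle management can look like on the Python side, assuming the TRT 8.x API; the function name and ONNX path are hypothetical, not code from this thread:

```
# build.py -- hedged sketch, assuming the TRT 8.x Python API.
import gc
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_once(onnx_path):
    # All of these objects wrap C++ handles; holding Python references
    # keeps the underlying host/device memory alive.
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("ONNX parse failed")
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    plan = bytes(serialized)
    # Drop the references explicitly so the C++ objects are destroyed now,
    # rather than whenever the Python garbage collector gets around to it.
    del serialized, config, parser, network, builder
    gc.collect()
    return plan
```

If memory still appears to grow across iterations, it is usually these lingering Python references rather than TRT itself; the C++ API makes the destruction points explicit.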
@nvpohanh ^ ^
Can you share the ONNX model here?
I cannot reproduce this on my RTX 8000 with TRT 8.4.1.5 and the official TRT Docker image nvcr.io/nvidia/tensorrt:22.07-py3. It looks like your driver is pretty old; can you try upgrading your...
TRT 8.4 still supports CUDA 10.2. Did you download the CUDA 11 packages and use them in the CUDA 10 environment?
> Do you have any suggestion?

@kevinch-nv ^ ^

> Other thing, I would convert to TRT then apply on Jetson Xavier NX. Do I need convert TensorRT Engine directly...
Can we close it?
Our sampleMNIST should work. I think it's a code issue; you can debug it using gdb or add some prints.
TensorRT does support ViT. As you can see, you successfully converted the ONNX model to a TRT engine, and the trtexec log shows the inference latency and throughput.
I don't have the env for the notebook, but I tried to reproduce it like this:

```
#trt.py
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit
import numpy...
```
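The snippet above is cut off, so here is a hedged sketch of how such a reproduction script typically continues, assuming a serialized engine file (the `model.engine` name is hypothetical) with one input and one output binding, using the TRT 8.x Python API with pycuda:

```
import numpy as np
import pycuda.autoinit  # creates and manages a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine from disk.
with open("model.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate pinned host and device buffers for every binding.
bindings, host_bufs, dev_bufs = [], [], []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    bindings.append(int(dev))
    host_bufs.append(host)
    dev_bufs.append(dev)

# Assuming binding 0 is the input and binding 1 the output:
# copy random input in, run inference, copy the output back.
host_bufs[0][:] = np.random.rand(host_bufs[0].size).astype(host_bufs[0].dtype)
stream = cuda.Stream()
cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
cuda.memcpy_dtoh_async(host_bufs[1], dev_bufs[1], stream)
stream.synchronize()
print(host_bufs[1])
```

With pycuda.autoinit handling the context, the explicit stream synchronize at the end ensures the output copy has completed before the result is read.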