TensorRT
Memory leak when building a TensorRT engine multiple times in a loop while converting an ONNX model to TensorRT
Description
A memory leak occurs when building a TensorRT engine multiple times in a loop while converting an ONNX model to TensorRT. After an engine has been built, rebuilding the same engine on the next loop iteration does not release all of the previously allocated memory.
Environment
Docker image: nvcr.io/nvidia/tensorrt:22.04-py3
NVIDIA GPU: T4
NVIDIA Driver Version: 460.73.01
CUDA Version: 11.2
Relevant Files
Steps To Reproduce
import tensorrt as trt

trt_logger = trt.Logger(trt.Logger.ERROR)

def convert_onnx_tensorrt(path):
    # build_engine is a user-defined helper (not shown here)
    tensorrt_engine = build_engine(
        runtime=trt.Runtime(trt_logger),
        onnx_file_path=path,
        logger=trt_logger,
        min_shape=(1, 512),
        optimal_shape=(1, 512),
        max_shape=(1, 512),
        workspace_size=10000 * 1024 * 1024,
        fp16=True,
        int8=False,
    )

path = "mnist-8.onnx"
for i in range(10):
    convert_onnx_tensorrt(path)
# 1st run in loop: memory consumed ~ 217 MiB
# 2nd run in loop: memory consumed ~ 2.57 GiB
# 3rd run in loop: memory consumed ~ 4.66 GiB
# 4th run in loop: memory consumed ~ 5.71 GiB
# 5th run in loop: memory consumed ~ 6.76 GiB
# 6th run in loop: memory consumed ~ 7.4 GiB
# 7th run in loop: memory consumed ~ 7.8 GiB
# ...
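One way to narrow down whether the growth comes from Python objects that are never collected is to attach a weak reference to the engine on each iteration and verify that it dies once the strong reference is dropped. A minimal sketch of the technique, using a stand-in object in place of the real TensorRT engine (the names `FakeEngine` and `build_once` are hypothetical, not part of TensorRT):

```python
import gc
import weakref

class FakeEngine:
    """Stand-in for a TensorRT engine; holds a large buffer."""
    def __init__(self):
        self.buf = bytearray(1024 * 1024)  # simulate a big allocation

def build_once():
    """Stand-in for one engine build; returns the engine object."""
    return FakeEngine()

leaked = []
for i in range(3):
    engine = build_once()
    ref = weakref.ref(engine)   # weak ref does not keep the engine alive
    del engine                  # drop the only strong reference
    gc.collect()                # force collection of any reference cycles
    if ref() is not None:       # object survived -> something still holds it
        leaked.append(i)

print("iterations with surviving engines:", leaked)
```

If the real engine (or builder/parser) objects survive this check, something in the Python process is still holding a reference to them between iterations.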
This looks more like a Python memory-handling issue than a TRT memory leak; I would suggest using the C++ API if you want finer control over memory.
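A common workaround when a build leaks inside a long-lived Python process is to run each build in a short-lived child process, so the OS reclaims all of the child's memory when it exits. A sketch using the standard library's `multiprocessing`, with a placeholder build function standing in for the real `convert_onnx_tensorrt` (assumes a fork-capable platform such as Linux):

```python
import multiprocessing as mp

def build_in_child(path, result_queue):
    """Placeholder for the real build; in practice this would call
    convert_onnx_tensorrt(path) and put the serialized engine on the queue."""
    result_queue.put(f"built:{path}")

def isolated_build(path):
    """Run one engine build in a fresh process; the child's memory
    is released by the OS when it exits."""
    queue = mp.Queue()
    proc = mp.Process(target=build_in_child, args=(path, queue))
    proc.start()
    result = queue.get()  # read the result before joining to avoid blocking
    proc.join()
    return result

if __name__ == "__main__":
    for i in range(3):
        print(isolated_build("mnist-8.onnx"))
```

The per-build process cost is usually acceptable here, since engine builds take seconds to minutes anyway.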
Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions. Thanks!