Allocating TensorRT model on GPU consumes all available Memory

Open sam-h-bean opened this issue 3 years ago • 1 comments

Description

When I allocate a TensorRT model on a GPU, it consumes all available memory on the device. I have tried instantiating the model both on an empty GPU and on a GPU that already holds another model. Both cases work, but CUDA memory fills up completely. This seems to indicate that TensorRT is consuming all available GPU memory.
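To put a number on "fills the CUDA memory", one option is to sample `nvidia-smi` before and after loading the model. The following is a minimal sketch, not part of the original report: the helper names `parse_memory_csv`, `gpu_memory`, and `memory_delta` are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports plain integer MiB values for every GPU.

```python
import shutil
import subprocess

# Query used and total memory per GPU, in MiB, as bare CSV rows.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse rows such as '1024, 16160' into (used_mib, total_mib) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory():
    """Return [(used_mib, total_mib), ...], or an empty list if nvidia-smi is not available."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


def memory_delta(before, after):
    """Per-GPU increase in used memory (MiB) between two gpu_memory() snapshots."""
    return [a[0] - b[0] for b, a in zip(before, after)]
```

Calling `gpu_memory()` once before and once after the model is loaded, then passing both snapshots to `memory_delta`, gives the per-GPU growth in MiB attributable to that step (plus anything else that allocated in between).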

Environment

TensorRT Version: 8.4.1.5
NVIDIA GPU: Tesla V100
NVIDIA Driver Version: 470.57.02
CUDA Version: 11.7
CUDNN Version:
Operating System: Ubuntu
Python Version (if applicable): 3.8
PyTorch Version (if applicable): 1.12.0+cu102
Baremetal or Container (if so, version): nvcr.io/nvidia/tensorrt:22.06-py3 but pip install nvidia-tensorrt==8.4.1.5

Steps To Reproduce

from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.modeling_ort import ORTModelForCausalLM
from optimum.onnxruntime.configuration import AutoQuantizationConfig
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('distilgpt2')
tokenizer = AutoTokenizer.from_pretrained('distilgpt2', padding=True, truncation=True)

qconfig = AutoQuantizationConfig.tensorrt(is_static=False, per_channel=False)
quantizer = ORTQuantizer(model=model, preprocessor=tokenizer, feature="causal-lm")
quantizer.export(
    onnx_model_path="model.onnx",
    onnx_quantized_model_output_path="model-quantized.onnx",
    quantization_config=qconfig,
    use_external_data_format=True
)

tensorrt_model = ORTModelForCausalLM.load_model("./model-quantized.onnx", provider="TensorrtExecutionProvider")

sam-h-bean avatar Jul 14 '22 07:07 sam-h-bean

I don't see TRT involved in your Steps To Reproduce. This looks more like an issue with optimum or transformers.

BTW can you try using trtexec to convert your model? e.g. trtexec --onnx=model-quantized.onnx --int8 --fp16 --saveEngine=model-quantized.onnx.plan
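If it is more convenient to drive that conversion from Python, the trtexec call can be wrapped with `subprocess`. This is only a sketch: the helper names `build_trtexec_command` and `run_trtexec` are hypothetical, the flags are exactly the ones from the command above, and it assumes `trtexec` is on the PATH.

```python
import shutil
import subprocess


def build_trtexec_command(onnx_path, engine_path, int8=True, fp16=True):
    """Assemble the trtexec argument list using only the flags from the suggested command."""
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if int8:
        cmd.append("--int8")
    if fp16:
        cmd.append("--fp16")
    cmd.append(f"--saveEngine={engine_path}")
    return cmd


def run_trtexec(cmd):
    """Run the command and return its exit code, or None if trtexec is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True).returncode
```

For example, `run_trtexec(build_trtexec_command("model-quantized.onnx", "model-quantized.onnx.plan"))` would attempt the same conversion as the command above; a non-zero exit code means the build failed.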

zerollzeng avatar Jul 15 '22 16:07 zerollzeng

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

ttyio avatar Dec 06 '22 01:12 ttyio