TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Results 628 TensorRT issues

This engine was built in FP16 mode, and the TensorRT inference results are correct. I just wonder how TensorRT knows when to stop inference, as there is a cycle in the below...

triaged

I want to implement block quantization through tensorflow-quantization. What process should I follow? Or can you provide a simple example?

triaged
Module:Quantization
Investigating

I am performing QAT quantization on the HRNet OCR model and using TensorRT 8.6.2 to convert and quantize the generated ONNX model with QDQ operations. After conversion, I found that...

triaged
Module:Quantization

## Description I am performing QAT quantization on a complex model. When I insert Q/DQ nodes into the ResNet portion I want to quantize according to the rules, TensorRT can...

triaged
Module:Quantization

## Description I measured inference time for a segmentation model (BiSeNetV2) on my machine with TRT-8.5.3.1 and TRT-10.5.0.18, and I found a big difference in inference speed between...

Module:Performance

## Description I used the following commands to convert an ONNX model to a TRT engine, where the input.onnx file is the original model: `polygraphy surgeon sanitize --fold-constants ./input.onnx...`

triaged

Environment: • TensorRT Version: 10.9.0.34 • GPU Type: NVIDIA GeForce RTX 3090 (24GB VRAM) • Nvidia Driver Version: 572.83 • CUDA Version: 12.8.1 • CUDNN Version: 9.8.0 • Operating System:...

Module:Engine Build
triaged

![Image](https://github.com/user-attachments/assets/f10bc335-a18c-4f79-a6c8-c00752dda788) ![Image](https://github.com/user-attachments/assets/a2c8e57e-6748-4e0c-82c8-e06195203a5b) I would like to ask: would implementing the Argmax operation with a TensorRT plugin or a custom CUDA kernel be faster?

triaged

Hello team, thanks for all the great work. I am training a model where I provide tile-wise constant attention masks (see picture below). At inference time, how will TensorRT...

triaged