TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
## Description
I was trying to follow these two notebooks:
1) https://github.com/NVIDIA/TensorRT/blob/master/tools/pytorch-quantization/examples/calibrate_quant_resnet50.ipynb
2) https://github.com/NVIDIA/TensorRT/blob/master/tools/pytorch-quantization/examples/finetune_quant_resnet50.ipynb
As I understand it, step 1 should result in a quantized INT8 model. So I should expect a...
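The calibration step in the first notebook collects activation statistics ("amax", the maximum absolute value) and uses them to pick INT8 scales. This is not the pytorch-quantization API itself, just a minimal NumPy sketch of what max calibration computes and how the resulting scale quantizes a tensor; all function names here are illustrative:

```python
import numpy as np

def amax_calibrate(batches):
    """Max calibration: track the largest absolute activation value seen."""
    return max(np.abs(b).max() for b in batches)

def quantize_int8(x, amax):
    """Symmetric INT8 quantization: map [-amax, amax] onto [-127, 127]."""
    scale = amax / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
batches = [rng.standard_normal(1024).astype(np.float32) for _ in range(4)]
amax = amax_calibrate(batches)          # run calibration data through the model
q, scale = quantize_int8(batches[0], amax)
x_hat = dequantize(q, scale)
# For values inside [-amax, amax] the round-trip error is at most scale / 2.
print(np.abs(batches[0] - x_hat).max())
```

Note that after calibration the model is still "fake-quantized" (weights stored in FP32 with quantize/dequantize nodes); the actual INT8 kernels are only selected when TensorRT builds the engine.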
## Description
I want to use the TensorRT API to rebuild a model that contains some conv1d layers (built with `torch.nn.Conv1d`). The input shape is [1,1,82000], and the conv1d output shape is [1,512,16399],...
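The shapes quoted above are consistent with the standard convolution output-length formula from the `torch.nn.Conv1d` documentation. A quick sketch (the kernel/stride values below are hypothetical — they are just one parameter combination that maps 82000 to 16399):

```python
def conv1d_out_len(l_in, kernel, stride=1, padding=0, dilation=1):
    # Output-length formula from the torch.nn.Conv1d documentation:
    # floor((L_in + 2p - d*(k-1) - 1) / s) + 1
    return (l_in + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# [1,1,82000] -> [1,512,16399] fits, e.g., kernel_size=10, stride=5, padding=0
# (hypothetical parameters; the channel count 512 comes from out_channels).
print(conv1d_out_len(82000, kernel=10, stride=5))  # -> 16399
```

When rebuilding such a layer with the TensorRT network API, a common approach is to express the 1D convolution as a 2D convolution over a tensor with an extra unit spatial dimension (e.g. shape [1,1,1,82000] with kernel (1, k)), since the ND convolution layer works on spatial dims of rank >= 2.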
## Description
No layer in sampleDynamicReshape runs on DLA. Is there any easy way to run these layers on DLA?
## Environment
JetPack 5.0.1 on AGX Orin and...
## Description
No layer in sampleNMT runs on DLA. Is there any easy way to run these layers on DLA?
## Environment
JetPack 5.0.1 on AGX Orin and...
The PyTorch code is: `indices = mask.nonzero(as_tuple=False)[:, 0]`. The error in TensorRT is:
[08/23/2022-10:06:28] [I] [TRT] No importer registered for op: NonZero. Attempting to import as plugin. [08/23/2022-10:06:28] [I]...
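`NonZero` is hard for TensorRT because its output shape depends on the data, not just the input shape. When the gathered indices are only used to reduce over the selected rows, one static-shape workaround is to multiply by the mask instead of gathering. A NumPy sketch of the idea (the masked-mean use case here is an assumption, not taken from the issue):

```python
import numpy as np

def masked_mean_gather(x, mask):
    # Data-dependent shape: nonzero() yields a variable number of indices,
    # which is what trips up the ONNX -> TensorRT import.
    return x[np.nonzero(mask)[0]].mean(axis=0)

def masked_mean_static(x, mask):
    # Static-shape alternative: weight every row by the mask and normalize.
    # All intermediate shapes are fixed, so no NonZero op is needed.
    w = mask.astype(x.dtype)[:, None]
    return (x * w).sum(axis=0) / w.sum()

x = np.arange(12, dtype=np.float32).reshape(4, 3)
mask = np.array([1, 0, 1, 0], dtype=bool)
print(masked_mean_static(x, mask))  # matches the gather-based version
```

This rewrite only works when the downstream computation is a reduction (sum, mean, etc.); if the variable-length index list itself is needed, a plugin or a post-processing step on the host is required.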
Hi, I trained and ran dense and 2:4 structured sparse versions of the InceptionV3 model on an NVIDIA A6000 GPU. However, I observe only marginal performance improvement and even worse...
## Description
In the official [Detectron 2 Mask R-CNN R50-FPN 3x TensorRT compilation script](https://github.com/NVIDIA/TensorRT/tree/main/samples/python/detectron2), the model makes zero predictions on an image if the input image...
## Description
When I convert efficientdet-d7x from the AutoML models following https://github.com/NVIDIA/TensorRT/tree/main/samples/python/efficientdet, I get "INVALID_GRAPH : This is an invalid model. Error in Node:strided_slice_1 : Node (strided_slice_1) has input size 0...
## Description
Hi, I am trying to convert a yolov5m model into a TensorRT engine for batch inference (batch size of 4), but I am getting an error during the `trtexec` conversion. This...
## Description
When running the TensorRT engine and profiling it (using `trtexec`), we found that two of the `ForeignNode`s take 60% of the inference time. And the total number of nodes in...