TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
## Description
I was trying to follow these two notebooks:
1) https://github.com/NVIDIA/TensorRT/blob/master/tools/pytorch-quantization/examples/calibrate_quant_resnet50.ipynb
2) https://github.com/NVIDIA/TensorRT/blob/master/tools/pytorch-quantization/examples/finetune_quant_resnet50.ipynb
As I understand it, step 1 should result in a quantized INT8 model. So I should expect a...
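The calibration step in the first notebook collects activation statistics ("amax", the maximum absolute value) and uses them to pick INT8 scales. This is not the pytorch-quantization API itself, just a minimal NumPy sketch of what max calibration computes and how the resulting scale quantizes a tensor; all function names here are illustrative:

```python
import numpy as np

def amax_calibrate(batches):
    """Max calibration: track the largest absolute activation value seen."""
    return max(np.abs(b).max() for b in batches)

def quantize_int8(x, amax):
    """Symmetric INT8 quantization: map [-amax, amax] onto [-127, 127]."""
    scale = amax / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
batches = [rng.standard_normal(1024).astype(np.float32) for _ in range(4)]
amax = amax_calibrate(batches)          # run calibration data through the model
q, scale = quantize_int8(batches[0], amax)
x_hat = dequantize(q, scale)
# For values inside [-amax, amax] the round-trip error is at most scale / 2.
print(np.abs(batches[0] - x_hat).max())
```

Note that after calibration the model is still "fake-quantized" (weights stored in FP32 with quantize/dequantize nodes); the actual INT8 kernels are only selected when TensorRT builds the engine.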
## Description
I want to use the TensorRT API to rebuild a model that contains some conv1d layers (built with `torch.nn.Conv1d`). The input shape is [1,1,82000], and the conv1d output shape is [1,512,16399],...
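The shapes quoted above are consistent with the standard convolution output-length formula from the `torch.nn.Conv1d` documentation. A quick sketch (the kernel/stride values below are hypothetical — they are just one parameter combination that maps 82000 to 16399):

```python
def conv1d_out_len(l_in, kernel, stride=1, padding=0, dilation=1):
    # Output-length formula from the torch.nn.Conv1d documentation:
    # floor((L_in + 2p - d*(k-1) - 1) / s) + 1
    return (l_in + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# [1,1,82000] -> [1,512,16399] fits, e.g., kernel_size=10, stride=5, padding=0
# (hypothetical parameters; the channel count 512 comes from out_channels).
print(conv1d_out_len(82000, kernel=10, stride=5))  # -> 16399
```

When rebuilding such a layer with the TensorRT network API, a common approach is to express the 1D convolution as a 2D convolution over a tensor with an extra unit spatial dimension (e.g. shape [1,1,1,82000] with kernel (1, k)), since the ND convolution layer works on spatial dims of rank >= 2.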
## Description
No layer in sampleDynamicReshape runs on DLA. Is there any easy way to run these layers on DLA?
## Environment
JetPack 5.0.1 on AGX Orin and...
## Description
No layer in sampleNMT runs on DLA. Is there any easy way to run these layers on DLA?
## Environment
JetPack 5.0.1 on AGX Orin and...
The PyTorch code is: `indices = mask.nonzero(as_tuple=False)[:, 0]`. The error in TensorRT is:
[08/23/2022-10:06:28] [I] [TRT] No importer registered for op: NonZero. Attempting to import as plugin. [08/23/2022-10:06:28] [I]...
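`NonZero` is hard for TensorRT because its output shape depends on the data, not just the input shape. When the gathered indices are only used to reduce over the selected rows, one static-shape workaround is to multiply by the mask instead of gathering. A NumPy sketch of the idea (the masked-mean use case here is an assumption, not taken from the issue):

```python
import numpy as np

def masked_mean_gather(x, mask):
    # Data-dependent shape: nonzero() yields a variable number of indices,
    # which is what trips up the ONNX -> TensorRT import.
    return x[np.nonzero(mask)[0]].mean(axis=0)

def masked_mean_static(x, mask):
    # Static-shape alternative: weight every row by the mask and normalize.
    # All intermediate shapes are fixed, so no NonZero op is needed.
    w = mask.astype(x.dtype)[:, None]
    return (x * w).sum(axis=0) / w.sum()

x = np.arange(12, dtype=np.float32).reshape(4, 3)
mask = np.array([1, 0, 1, 0], dtype=bool)
print(masked_mean_static(x, mask))  # matches the gather-based version
```

This rewrite only works when the downstream computation is a reduction (sum, mean, etc.); if the variable-length index list itself is needed, a plugin or a post-processing step on the host is required.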
Hi, I trained and ran dense and 2:4 structured sparse versions of the InceptionV3 model on an NVIDIA A6000 GPU. However, I observe only marginal performance improvement and even worse...
## Description
In the official [Detectron 2 Mask R-CNN R50-FPN 3x TensorRT compilation script](https://github.com/NVIDIA/TensorRT/tree/main/samples/python/detectron2), the model makes zero predictions on an image if the input image...
## Description
When I convert efficientdet-d7x from the AutoML models following https://github.com/NVIDIA/TensorRT/tree/main/samples/python/efficientdet, I get "INVALID_GRAPH : This is an invalid model. Error in Node:strided_slice_1 : Node (strided_slice_1) has input size 0...
## Description
Hi, I am trying to convert a yolov5m model into a TensorRT engine for batch inference (batch size of 4), but I am getting an error during the `trtexec` conversion. This...
## Description
When running the TensorRT engine and profiling it (using `trtexec`), we found that two of the `ForeignNode`s take 60% of the inference time. And the total number of nodes in...