
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Results: 628 TensorRT issues, sorted by recently updated

PyTorch now supports FlashAttention v2, which is about 2x faster than FlashAttention v1: https://pytorch.org/blog/pytorch2-2/ So I'm wondering whether TensorRT 9.2 already supports FlashAttention v2, or whether I have...
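For context, a minimal sketch (not taken from the issue) of what the PyTorch 2.2 side looks like: `F.scaled_dot_product_attention` dispatches to the FlashAttention kernel when that backend is forced. Shapes and dtypes below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) -- illustrative sizes only
q = torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Force the flash backend; this raises if it is unusable on the current GPU/dtype.
with torch.backends.cuda.sdp_kernel(enable_flash=True,
                                    enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)
```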

triaged

## Description I customized TensorRT's Col2Im plugin, recompiled the TensorRT 8.5 source code, and generated a new nvinfer_plugin library. ## Environment **TensorRT Version**: 9.2.0.5 **NVIDIA GPU**: GeForce GTX 1080 Ti...
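As a hedged sketch (the library path below is a placeholder, not from the issue), a rebuilt `nvinfer_plugin` is typically preloaded so its plugin creators register before the ONNX parser runs:

```python
import ctypes
import tensorrt as trt

# Preload the rebuilt plugin library globally so its creators register with TensorRT.
ctypes.CDLL("/path/to/custom/libnvinfer_plugin.so", mode=ctypes.RTLD_GLOBAL)

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

# Confirm the customized Col2Im creator is visible in the registry.
for creator in trt.get_plugin_registry().plugin_creator_list:
    print(creator.name, creator.plugin_version)
```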

ONNX
triaged

## Description I am trying to convert a slightly modified version of [YOSO](https://github.com/hujiecpp/YOSO) from PyTorch to TRT. I cannot make it work with batch size 8. Can you please point me...
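For reference, a minimal sketch of building an engine whose optimization profile covers batch size 8, assuming the ONNX export has a dynamic batch dimension; the input name, shapes, and file path are placeholders.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("yoso.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# min / opt / max shapes for the (assumed) image input; batch 8 must fall inside the range.
profile.set_shape("input", (1, 3, 800, 1333), (8, 3, 800, 1333), (8, 3, 800, 1333))
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
```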

triaged

Using TensorRT for BERT inference is slower than onnxruntime: TensorRT takes 10 ms while ONNX Runtime takes 6 ms, and the model is just a simple BERT classification model. Could someone help me? ONNX code: ``` import numpy as np import...

triaged
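Latency comparisons like the BERT one above are sensitive to warm-up and to how runs are averaged; as a hedged sketch, the ONNX Runtime side of such a measurement (the model path and input names are placeholder assumptions) might look like:

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("bert.onnx", providers=["CUDAExecutionProvider"])
feeds = {
    "input_ids": np.random.randint(0, 30522, (1, 128), dtype=np.int64),
    "attention_mask": np.ones((1, 128), dtype=np.int64),
}

for _ in range(20):          # warm-up runs are excluded from the measurement
    sess.run(None, feeds)

start = time.perf_counter()
for _ in range(100):
    sess.run(None, feeds)
print((time.perf_counter() - start) / 100 * 1000, "ms per run")
```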

## Description I tried to convert my onnx model to .trt but trtexec segfaulted. See attached log output of trtexec ... the program segfaults after the final line you see...
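One hedged first step (not from the issue) when trtexec crashes without a message is to validate the ONNX file itself, which can rule out a malformed graph; the path below is a placeholder.

```python
import onnx

model = onnx.load("model.onnx")          # placeholder path
onnx.checker.check_model(model)          # raises if the graph is structurally invalid
print(onnx.helper.printable_graph(model.graph)[:2000])  # print the start of the graph for inspection
```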

triaged

Hello, thanks for all the great work! Some of my models require bfloat16 at inference time; I saw it was added in TensorRT 9 with TensorRT-LLM, and I was...
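For reference, a minimal sketch of requesting bfloat16 at build time, under the assumption that the installed TensorRT 9 Python package exposes `trt.BuilderFlag.BF16` (network construction omitted):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Assumption: this TensorRT 9 build exposes a BF16 builder flag.
if hasattr(trt.BuilderFlag, "BF16"):
    config.set_flag(trt.BuilderFlag.BF16)
else:
    print("This TensorRT build does not expose a BF16 builder flag")
```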

triaged

## Description When comparing multi-head attention between Torch 2.2 and TensorRT 9.2 on an A100-SXM4-40G, I found that for certain sizes the resulting engine does not use `_gemm_mha_v2` tactics. When not...
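One way to confirm which tactic the builder actually picked is to build with detailed profiling verbosity and dump layer information through the engine inspector; a hedged sketch, assuming an engine file already built with `trt.ProfilingVerbosity.DETAILED` (the path is a placeholder):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(logger)
with open("mha.engine", "rb") as f:   # placeholder path
    engine = runtime.deserialize_cuda_engine(f.read())

# Tactic/kernel names appear here only if the engine was built with
# config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED
inspector = engine.create_engine_inspector()
print(inspector.get_engine_information(trt.LayerInformationFormat.JSON))
```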

triaged
internal-bug-tracked

## Description The `tmp` values are precomputed for re-use, and `tmp` is calculated as below: https://github.com/NVIDIA/TensorRT/blob/78245b0ac2af9a208ed02e5257bfd3ee7ae8a88d/plugin/disentangledAttentionPlugin/disentangledKernel.cu#L122 The sequence length `dimResult.y` is wrongly used as the max relative position, but according to...

triaged
internal-bug-tracked

## Description Using trtexec to convert an ONNX model to TRT failed, but there is no further error information. How can I solve it? ```bash [02/20/2024-10:56:21] [E] Error[2]: Assertion engine failed. [02/20/2024-10:56:21] [E] Error[2]:...

triaged

The code below shows that the NumPy path works perfectly, but using a torch GPU tensor reports an error. My actual usage scenario is to decode video using VPF first,...
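A common pattern for feeding a torch CUDA tensor straight into a pre-TensorRT-10 execution context is to pass its device pointer as a binding and launch on torch's current stream; a hedged sketch (the engine path, I/O order, and shapes are placeholder assumptions), assuming the tensor is contiguous:

```python
import torch
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(logger)
with open("model.engine", "rb") as f:   # placeholder path
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

frame = torch.rand(1, 3, 224, 224, device="cuda").contiguous()  # e.g. a decoded video frame
output = torch.empty(1, 1000, device="cuda")                    # placeholder output shape

# Pass raw device pointers; TensorRT does not copy, so both tensors must stay alive.
bindings = [int(frame.data_ptr()), int(output.data_ptr())]
stream = torch.cuda.current_stream().cuda_stream
context.execute_async_v2(bindings=bindings, stream_handle=stream)
torch.cuda.current_stream().synchronize()
```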

triaged