TensorRT
                                
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
### Description
Fix duplicated input node issue in TensorRT 8.6 ONNX export. This PR addresses a problem where exporting ONNX models using TensorRT 8.6 results in two input nodes being...
My PyTorch and ONNX model has a uint8-to-fp32 cast layer that divides by 255, applied to the input tensor. When I convert the ONNX...
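The preprocessing described above can be sketched in NumPy; the function name and the tiny sample tensor are illustrative, not from the original model:

```python
import numpy as np

# Minimal sketch of the preprocessing described above: a uint8 input
# tensor is cast to fp32 and divided by 255 (names here are illustrative).
def normalize_uint8(image: np.ndarray) -> np.ndarray:
    """Cast a uint8 tensor to float32 and scale it into [0, 1]."""
    return image.astype(np.float32) / 255.0

# Example: a 2x2 single-channel "image" of uint8 pixel values.
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)
scaled = normalize_uint8(pixels)
print(scaled.dtype)  # float32
print(scaled.max())  # 1.0
```

In an exported ONNX graph this pair of operations typically appears as a `Cast` node followed by a `Div` (or a `Mul` by 1/255) applied directly to the graph input, which is the region of the graph the conversion issue above concerns.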
## Description
I'm trying to generate a calibration cache file for post-training quantization using Polygraphy. To do so, I created a custom input JSON file following [https://github.com/NVIDIA/TensorRT/blob/main/tools/Polygraphy/how-to/use_custom_input_data.md]. The input shape of...
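Per the how-to linked above, Polygraphy consumes custom input data as a list of feed dicts (one per iteration), each mapping input tensor names to NumPy arrays. A minimal sketch of that structure, where the input name and shape are hypothetical placeholders for the model's real ones:

```python
import numpy as np

# Hypothetical input name and shape; substitute your model's actual input.
INPUT_NAME = "input_0"
SHAPE = (1, 3, 224, 224)

# Polygraphy expects a list of feed dicts, one per calibration/inference
# iteration, each mapping input names to arrays of the right dtype/shape.
feed_dicts = [
    {INPUT_NAME: np.random.rand(*SHAPE).astype(np.float32)}
    for _ in range(4)
]

# The linked how-to then serializes this with Polygraphy's JSON helper:
#   from polygraphy.json import save_json
#   save_json(feed_dicts, "custom_inputs.json")
# and feeds it to the tools via `--load-inputs custom_inputs.json`
# (a `--data-loader-script` is the alternative, script-based route).
print(len(feed_dicts), feed_dicts[0][INPUT_NAME].shape)
```

The key detail when debugging calibration problems is that every array's dtype and shape must match what the network expects for that input, which is why the sketch casts explicitly to `float32`.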
## Description
```
[09/12/2024-14:32:23] [TRT] [E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::deallocate::42] Error Code 1: Cuda Runtime (invalid argument)
[09/12/2024-14:32:23] [TRT] [E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::deallocate::42] Error Code 1: Cuda Runtime (invalid argument)
[09/12/2024-14:32:23] [TRT] [E] 1: ...
```
Hello, how can I build trtexec on Windows 11? I tried with CMake but it's giving plenty of errors... Thanks in advance, Joan
I have converted my ONNX model to TensorRT, but the result is quite strange. My model is trained in mixed precision; when I add the following line, it will convert to fp16...
## Description

## Environment
**TensorRT Version**: 10.4.0.26-1+cuda12.6 (upgrading from 10.3)
**NVIDIA GPU**: V100
**NVIDIA Driver Version**:
**CUDA Version**: Cuda compilation tools, release 12.5, V12.5.82
**CUDNN Version**: 9
**Operating System**:
Python...
## Description
I have two different modules converted to TensorRT engines. When I run them serially, the inference-only cost is:
```
// 10 times do_infer >> cost 400.60...
I am working on statically building TensorRT on my Windows system. The goal is to reduce the size of my program by eliminating the need for dynamic libraries (DLLs) and...
I keep having issues when compiling apps that require CUDA and C++ tools on Windows. I would like to know the best version for CUDA 11.8 and CUDA 12.4. There are...