TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
# GPU: 4090
[06/18/2025-03:48:34] [I] TensorRT version: 10.10.0
[06/18/2025-03:48:34] [I] Loading standard plugins
[06/18/2025-03:48:34] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 26, GPU 390 (MiB)
[06/18/2025-03:48:37]...
## Description When running inference on a TensorRT engine built from an ONNX model, I observe significant discrepancies between TensorRT and ONNX Runtime outputs. The difference is not minor -...
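A common first step for issues like this is to run both backends on identical inputs and quantify the gap element-wise before digging into layer precisions. A minimal NumPy sketch (the tolerances and the synthetic arrays below are placeholders, not values from the report):

```python
import numpy as np

def compare_outputs(trt_out: np.ndarray, ort_out: np.ndarray,
                    atol: float = 1e-3, rtol: float = 1e-3) -> dict:
    """Report max absolute/relative difference between two backend outputs."""
    a = trt_out.astype(np.float64)
    b = ort_out.astype(np.float64)
    abs_diff = np.abs(a - b)
    rel_diff = abs_diff / np.maximum(np.abs(b), 1e-12)
    return {
        "max_abs": float(abs_diff.max()),
        "max_rel": float(rel_diff.max()),
        "allclose": bool(np.allclose(trt_out, ort_out, atol=atol, rtol=rtol)),
    }

# Synthetic data standing in for real TensorRT / ONNX Runtime outputs
trt = np.array([1.0, 2.0005, 3.0])
ort = np.array([1.0, 2.0, 3.0])
print(compare_outputs(trt, ort))
```

If `max_abs` grows with model depth, that usually points at precision accumulation rather than a single broken layer.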
## Description I’m in the process of migrating from TensorRT 8.6 to 10.3. Following the migration guide provided in the documentation, I was able to get inference working on 10.3....
After converting an ONNX FP32 model to an INT8 engine with custom calibration, the engine layers still show FP32
## Description I followed the INT8 custom calibration example to build my INT8 engine from an ONNX FP32 model: https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/convert/01_int8_calibration_in_tensorrt After building the engine, I used the following to inspect...
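For context, the arithmetic behind symmetric INT8 calibration reduces to choosing a per-tensor scale from observed activation statistics. The sketch below illustrates a simple max-abs scheme with plain NumPy; it is not the TensorRT calibrator API, and the sample activations are made up:

```python
import numpy as np

def int8_scale_maxabs(activations: np.ndarray) -> float:
    """Symmetric per-tensor scale: map the observed max |x| onto 127."""
    return float(np.abs(activations).max()) / 127.0

def quantize(x: np.ndarray, scale: float) -> np.ndarray:
    """Quantize to int8 with rounding and saturation to [-127, 127]."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

# Hypothetical calibration batch of activations
acts = np.array([-6.35, 0.0, 3.175, 6.35], dtype=np.float32)
scale = int8_scale_maxabs(acts)          # ~0.05
print(scale, quantize(acts, scale))
```

Note that even with valid calibration scales, TensorRT may still keep individual layers in FP32 when the INT8 kernel is slower or unsupported, which is why inspecting the built engine's per-layer precision matters.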
CUDA_VISIBLE_DEVICES=0,1 python3 demo_img2vid.py --version svd-xt-1.1 --onnx-dir onnx-svd-xt-1-1 --engine-dir engine-svd-xt-1-1 --hf-token=$HF_TOKEN --batch-size=1 --use-cuda-graph
/usr/local/lib/python3.12/dist-packages/modelopt/torch/utils/import_utils.py:25: UserWarning: Failed to import apex plugin due to: ImportError("cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)")
warnings.warn(f"Failed...
A layer fails to run on DLA when a dynamic-shaped ONNX model with batch 4 is built, but the same static ONNX model with an implicit batch of the same size runs on...
Hello, is there any feasible way to convert the Detectron2 Faster R-CNN + FPN model to a TensorRT engine? I found a tutorial and scripts about the conversion of a Detectron2 Mask...
I am trying to convert a SigLIP2 model to TensorRT with FP16, but the cosine similarity between the ONNX and TRT outputs is 0.6463. I used the following code to convert to...
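For reference, the cosine similarity metric quoted above is the normalized dot product of the flattened outputs; a correct FP16 build is expected to stay very close to 1.0, so 0.6463 indicates a serious numeric divergence. A self-contained sketch (the random vectors stand in for real model outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two flattened tensors (1.0 = same direction)."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x = np.random.default_rng(0).standard_normal(1024)
print(cosine_similarity(x, x))                      # 1.0 up to float rounding
print(cosine_similarity(x, x + 1e-3 * np.abs(x)))   # small perturbation stays ~1.0
```

When the similarity drops this far, common culprits are FP16 overflow in attention or normalization layers; forcing those specific layers back to FP32 is a typical mitigation.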
## Description We run inference successfully with our model but observe stability issues: after hours or days of runtime, IExecutionContext::enqueue(V2/V3) suddenly starts returning false and does not recover...
Currently, SenseVoice's TensorRT engine can be successfully converted through trtexec, but when running benchmark inference, the error message shown below is displayed. ORT can be used to...