
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Results 599 TensorRT issues

Significant output differences when compiling and running the `facebook/bart-base` (https://huggingface.co/facebook/bart-base) model with Torch-TensorRT, even after applying FP16 and various precision settings. Compare the output using the following code: ```python import...
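The comparison code in this report is truncated, so the following is only a minimal sketch of how two sets of outputs could be compared numerically. It assumes both outputs have already been flattened to plain lists of floats; `compare_outputs` is a hypothetical helper, not part of Torch-TensorRT.

```python
import math


def compare_outputs(reference, candidate):
    """Return (max_abs_diff, cosine_similarity) for two equal-length float sequences.

    Hypothetical helper for illustration; inputs are assumed to be flattened outputs.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")

    # Largest element-wise deviation between the two outputs.
    max_abs_diff = max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)

    # Cosine similarity: close to 1.0 when the outputs point in the same direction.
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = math.sqrt(sum(r * r for r in reference))
    norm_cand = math.sqrt(sum(c * c for c in candidate))
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_cand)

    return max_abs_diff, cosine
```

Reporting both numbers is a design choice: the maximum absolute difference catches isolated outliers, while cosine similarity shows whether the outputs still agree overall.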

# Description When compiling `facebook/bart-base` with Torch-TensorRT, I encountered an error similar to the one in [this issue](https://github.com/pytorch/TensorRT/issues/3184), where `aten_ops.scatter.src` fails within `impl.elementwise.eq`. Upon investigation, I found that the issue...

component: conversion
component: api [Python]
cla signed
component: dynamo

This PR illustrates the use of NCCL ops from TRT-LLM in the example `examples/distributed_inference/tensor_parallel_simple_example.py`

component: lowering
component: api [Python]
cla signed
component: dynamo

## ❓ Question Only some ops support dynamic shapes, while others do not. What are the criteria for deciding whether an op supports dynamic shapes? For...

question

# Description A graph module's output might have nested structures depending on the implementation. For example, many models from transformers return outputs of type [ModelOutput](https://github.com/huggingface/transformers/blob/c409cd81777fb27aadc043ed3d8339dbc020fb3b/src/transformers/utils/generic.py#L310) (e.g. [CausalLMOutputsWithPast](https://github.com/huggingface/transformers/blob/c409cd81777fb27aadc043ed3d8339dbc020fb3b/src/transformers/modeling_outputs.py#L678)). This PR doesn't...

component: conversion
component: api [Python]
cla signed
component: dynamo
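To make the nested-output problem described in the PR above concrete, here is a minimal sketch of flattening a nested structure into a list of leaves and rebuilding it afterwards. It is not the PR's implementation: plain dicts, lists, and tuples stand in for model output types, and `flatten_outputs` and `unflatten_outputs` are hypothetical names.

```python
def flatten_outputs(obj):
    """Flatten nested dicts, lists, and tuples into (leaves, spec).

    Hypothetical helper; `spec` records the structure needed to rebuild `obj`.
    """
    if isinstance(obj, dict):
        leaves, children = [], []
        for key, value in obj.items():
            child_leaves, child_spec = flatten_outputs(value)
            leaves.extend(child_leaves)
            children.append((key, child_spec))
        return leaves, ("dict", children)
    if isinstance(obj, (list, tuple)):
        leaves, children = [], []
        for value in obj:
            child_leaves, child_spec = flatten_outputs(value)
            leaves.extend(child_leaves)
            children.append(child_spec)
        kind = "tuple" if isinstance(obj, tuple) else "list"
        return leaves, (kind, children)
    # Anything else is treated as a leaf (for example, a tensor).
    return [obj], ("leaf", None)


def unflatten_outputs(leaves, spec):
    """Rebuild the original nested structure from flat leaves and a spec."""
    leaf_iter = iter(leaves)

    def build(node):
        kind, children = node
        if kind == "leaf":
            return next(leaf_iter)
        if kind == "dict":
            return {key: build(child) for key, child in children}
        items = [build(child) for child in children]
        return tuple(items) if kind == "tuple" else items

    return build(spec)
```

For example, `flatten_outputs({"logits": 1, "past": ((2, 3), [4])})` yields the leaves `[1, 2, 3, 4]` together with a spec that `unflatten_outputs` can use to restore the original nesting.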

## Bug Description I'm trying to serve a Torch-TensorRT optimized model on NVIDIA Triton server, following the provided tutorial https://pytorch.org/TensorRT/tutorials/serving_torch_tensorrt_with_triton.html First, the provided script to generate the optimized model does not...

bug

# Description The cross-compile-for-Windows change adds the following new interfaces: **1) C++ side** Added the setup_engine() interface. Moved base64_encode/decode from register_jit_hooks.cpp to runtime.cpp since it is being...

component: tests
component: conversion
component: core
component: api [Python]
component: runtime
cla signed
component: dynamo
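The PR description above mentions `base64_encode`/`base64_decode` helpers on the C++ side. As a rough illustration of what such helpers do, the sketch below round-trips binary data through Base64 using Python's standard library. It assumes the purpose is to make binary engine data safe to store as text, which the truncated description does not confirm; `encode_engine` and `decode_engine` are hypothetical names, and the byte string is a stand-in, not a real serialized engine.

```python
import base64


def encode_engine(engine_bytes: bytes) -> str:
    """Encode raw bytes as an ASCII Base64 string (hypothetical helper)."""
    return base64.b64encode(engine_bytes).decode("ascii")


def decode_engine(encoded: str) -> bytes:
    """Decode an ASCII Base64 string back into the original bytes."""
    return base64.b64decode(encoded.encode("ascii"))


# Stand-in payload: arbitrary bytes, including values that are not valid text.
payload = bytes([0, 255, 16, 32, 128])
restored = decode_engine(encode_engine(payload))
```

After running this, `restored` holds the same bytes as `payload`, which is the property that matters when binary data has to travel through a text-only channel.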

## Bug Description When using the engine cache feature on Llama2-7b, I found that reusing a cached engine is pretty slow, even slower than training a non-refittable engine from scratch. I figured...

bug

## Bug Description > require_full_compilation (bool): Require modules to be compiled end to end or return an error as opposed to returning a hybrid graph where operations that cannot be...

bug

## Bug Description The output shape of `aten::_convolution` no longer matches PyTorch after the TensorRT 10 upgrade. I have noticed that the output shape is correct when I pass in...

bug
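When checking a shape mismatch like the one reported above, it can help to compute the expected output size by hand. The sketch below applies the standard formula for a non-transposed convolution, `floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1`, per spatial dimension. `conv_output_shape` is a hypothetical helper, and the report's actual parameters are truncated, so the values in the usage note are only illustrative.

```python
def conv_output_shape(input_size, kernel_size, stride, padding, dilation):
    """Expected spatial output size of a non-transposed convolution.

    Hypothetical helper; every argument is a tuple with one entry per spatial dimension.
    """
    dims = len(input_size)
    for name, arg in (("kernel_size", kernel_size), ("stride", stride),
                      ("padding", padding), ("dilation", dilation)):
        if len(arg) != dims:
            raise ValueError(f"{name} must have {dims} entries")

    return tuple(
        (i + 2 * p - d * (k - 1) - 1) // s + 1
        for i, k, s, p, d in zip(input_size, kernel_size, stride, padding, dilation)
    )
```

For instance, a 224x224 input with a 7x7 kernel, stride 2, padding 3, and dilation 1 gives `(112, 112)`. Comparing this value against both the PyTorch and the compiled output is one way to tell which side deviates.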