TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
# Description

As I requested, TensorRT 10.14 added a serialization flag, `trt.SerializationFlag.INCLUDE_REFIT`, that allows refitted engines to remain refittable; that means engines can be refitted multiple times. Based on that capability,...
# Description

Here is the CI pipeline: https://github.com/pytorch/TensorRT/actions/runs/20152165935/job/57847168432

Here is the auto-commit record: https://github.com/pytorch/TensorRT/commit/ab76c1db4d6c91c56956308a1db2f7ce37fa7fad

PyTorch upgraded CUDA from [13.0.0 to 13.0.2](https://github.com/pytorch/pytorch/commit/544b443ea1d1a9b19e65f981168a01cb87a2d333), which upgraded nvidia-cuda-runtime==13.0.96; however, tensorrt_cu13 has...
## Bug Description

```py
from tensorrt import Logger, Runtime
from torch import randn
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights
from torch_tensorrt import convert_method_to_trt_engine

# Create model
weights = MobileNet_V2_Weights.DEFAULT
model =...
```
# Description Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change....
# Description

Adds a test case to cover #3775.

## Type of change

Please delete options that are not relevant and/or add your own.

- Bug fix (non-breaking change which fixes...
**Is your feature request related to a problem? Please describe.**

**Describe the solution you'd like**

An implementation of the engine cache that is concurrency-aware. It should spin-lock repeated requests...
Reduction targets need to be added to the Autocast `DepthOfReductionRule`.
## Bug Description

```
from contextlib import nullcontext

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_tensorrt


class SampleNetwork(nn.Module):
    def __init__(
        self,
        num_attention_heads: int,
    ) -> None:...
```
This PR:

1. Adds rank-based logging for the distributed examples
2. Corrects the fallback-to-PyTorch case for the NCCL converters
3. Together with #3830, provides utilities for running distributed...
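Rank-based logging along the lines of item 1 might look like the sketch below. The `get_rank_logger` helper and its reliance on the `RANK` environment variable (set by `torchrun`) are assumptions for illustration, not the PR's actual implementation.

```python
import logging
import os


def get_rank_logger(name: str = "distributed_example") -> logging.Logger:
    # Hypothetical helper: read the rank from the RANK env var set by
    # torchrun, defaulting to 0 for single-process runs.
    rank = int(os.environ.get("RANK", "0"))
    logger = logging.getLogger(f"{name}.rank{rank}")
    if not logger.handlers:
        handler = logging.StreamHandler()
        # Prefix every record with the rank so interleaved multi-process
        # output remains attributable.
        handler.setFormatter(
            logging.Formatter(f"[rank {rank}] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Embedding the rank in both the logger name and the message prefix keeps per-rank loggers distinct while making combined logs readable.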