TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
# Description

Don't merge, just want to show the change.

Fixes # (issue)

## Type of change

Please delete options that are not relevant and/or add your own.

- Bug...
# Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change....
# Description

Convert some potential TensorRT errors (a returned boolean is false) into proper exceptions at runtime.

Fixes #2367

## Type of change

- Bug fix (non-breaking change which fixes an...
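The change above follows a common error-handling pattern. A minimal sketch in plain Python, assuming nothing about the real Torch-TensorRT internals (`TRTRuntimeError`, `raise_on_failure`, and `set_binding` are illustrative names, not the actual API):

```python
# Hypothetical sketch of the pattern this PR describes: calls that signal
# failure by returning False are wrapped so they raise a proper exception
# at runtime instead of failing silently.

class TRTRuntimeError(RuntimeError):
    """Raised when a TensorRT call reports failure via its return value."""

def raise_on_failure(ok: bool, operation: str) -> None:
    """Convert a boolean success flag into an exception."""
    if not ok:
        raise TRTRuntimeError(f"TensorRT operation failed: {operation}")

# Usage: check the returned boolean immediately after the call.
def set_binding(engine_ok: bool) -> str:
    raise_on_failure(engine_ok, "setBindingDimensions")
    return "bindings configured"

print(set_binding(True))
try:
    set_binding(False)
except TRTRuntimeError as e:
    print(e)  # prints: TensorRT operation failed: setBindingDimensions
```

The benefit over returning booleans is that the failure carries context (which operation failed) and cannot be accidentally ignored by the caller.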
## Bug Description

//tests/core/conversion/converters:test_scaled_dot_product_attention

(C++ exception with description "0 INTERNAL ASSERT FAILED at "../torch/csrc/jit/ir/alias_analysis.cpp":615, please report a bug to PyTorch. We don't have an op for aten::scaled_dot_product_attention but it isn't...
## Bug Description

Observing threshold failures. Test passes occasionally.

FAILED lowering/test_aten_lowering_passes.py::TestLowerLinear::test_lower_linear - AssertionError: 0.00113677978515625 != 0 within 4 places (0.00113677978515625 difference) : Linear TRT outputs don't match with the original...
## TL;DR

Operation converters in dynamo to support full compilation for GPT2.

## Goal(s)

Run GPT2 on multi-GPU with only 1 TensorRT engine.

## Tasks

```[tasklist]
### Tasks
- [...
```
## Bug Description

When running the compiled LSTM model in half dtype with torch-tensorrt, I get this error:

`RuntimeError: Input and parameter tensors are not the same dtype, found input...
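For context, the dtype-mismatch class of error above is easy to reproduce outside Torch-TensorRT. A minimal CPU-only sketch in plain PyTorch (using float64 vs. float32 as a stand-in for half vs. float, since half LSTMs need a GPU): an LSTM's input and parameter tensors must share a dtype, and casting both sides to one dtype resolves it.

```python
import torch

# Minimal sketch of the failure mode in this issue, reproduced with plain
# PyTorch on CPU. The LSTM's parameters default to float32; feeding it a
# float64 input triggers a dtype-mismatch RuntimeError.
lstm = torch.nn.LSTM(input_size=4, hidden_size=8)  # parameters are float32

x = torch.randn(5, 1, 4, dtype=torch.float64)      # mismatched input dtype
try:
    lstm(x)
except RuntimeError as e:
    print("mismatch rejected:", type(e).__name__)

# Fix: cast the input (or the module) so both sides share a dtype. The same
# idea applies when compiling for half: convert both model and inputs.
out, _ = lstm(x.to(torch.float32))
print(out.dtype)  # torch.float32
```

When targeting half precision, the equivalent step is converting both the module (`model.half()`) and every input tensor to `torch.half` before compilation.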
## Bug Description

TensorRT throws an error about fp32 input tensors even though I am using fp16 tensors as input. I attached the file `IFRNet.py`, adapted from [https://github.com/ltkong218/IFRNet/blob/main/models/IFRNet.py](https://github.com/ltkong218/IFRNet/blob/main/models/IFRNet.py).

## To Reproduce

Steps...
**Is your feature request related to a problem? Please describe.**

Our current workflow:

```py
ep = torch.export.export(model, (inputs,))
trt_gm = torch_tensorrt.dynamo.compile(ep, inputs=[inputs])
torch_tensorrt.save(trt_gm, "trt.ep", inputs=[inputs])
```

Desired workflow:

```py
ep...
```
## Bug Description

## To Reproduce

Steps to reproduce the behavior:

```py
input_data = torch.rand([1, 3, 1280, 720]).cuda(device)
print(type(input_data))
# input_data = input_data.to(device)
# Trace the module with example data
traced_model...
```