TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
# Description Implement SDPA converter using TRT MHA API Fixes # (issue) ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix...
Evaluate the performance of FLUX when using the resource partitioner
**Is your feature request related to a problem? Please describe.** We are seeing Torchbind operators from the C++ runtime being called into Python in order to dispatch. **Describe the...
# Description Add Groot N1.5-3B compilation example ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix (non-breaking change which fixes an...
**Is your feature request related to a problem? Please describe.** **Describe the solution you'd like** **Describe alternatives you've considered** **Additional context**
# Description Fixes #3934 ## Type of change - Bug fix (non-breaking change which fixes an issue) # Checklist: - [x] My code follows the style guidelines of this project...
## Bug Description Engine cache tests failed: FAILED models/test_engine_cache.py::TestEngineCache::test_dynamo_compile_with_custom_engine_cache FAILED models/test_engine_cache.py::TestEngineCache::test_torch_compile_graph_break FAILED models/test_engine_cache.py::TestEngineCache::test_torch_compile_with_custom_engine_cache https://gitlab-master.nvidia.com/dl/dgx/pytorch/-/jobs/234818816 https://gitlab-master.nvidia.com/dl/dgx/pytorch/-/jobs/234818814 ## To Reproduce It can be reproduced on a local Linux workstation ## Expected behavior...
## TL;DR ## Goal(s) ## Tasks ```[tasklist] ### Tasks ``` ## Additional context
🐛 [Bug] nn.MultiheadAttention fails with Torch-TensorRT due to non-contiguous tensor before view()
## Bug Description When compiling a simple nn.MultiheadAttention module with Torch-TensorRT using the dynamo IR, I get a runtime error related to view() because the tensor returned by scaled_dot_product_attention is...
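For readers unfamiliar with why view() can reject a non-contiguous tensor, the layout rule can be sketched in plain Python. This is a simplified model and not PyTorch's or Torch-TensorRT's implementation: the helper names (`contiguous_strides`, `transpose`, `can_merge_last_two`) and the example shape are hypothetical, and the transpose is only one assumed way a tensor might become non-contiguous, since the report above is truncated. The rule modeled here is that two adjacent dimensions can be flattened without a copy only when the outer stride equals the inner size times the inner stride.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed tensor of `shape`."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def transpose(shape, strides, a, b):
    """Swap two dims by swapping their sizes and strides; no data is moved."""
    shape, strides = list(shape), list(strides)
    shape[a], shape[b] = shape[b], shape[a]
    strides[a], strides[b] = strides[b], strides[a]
    return tuple(shape), tuple(strides)


def can_merge_last_two(shape, strides):
    """True if the last two dims can be flattened into one without copying."""
    return strides[-2] == shape[-1] * strides[-1]


# Hypothetical attention output laid out as (batch, heads, seq, head_dim).
shape = (2, 4, 8, 16)
strides = contiguous_strides(shape)  # (512, 128, 16, 1)

# Swap heads and seq so heads sits next to head_dim (assumed scenario).
t_shape, t_strides = transpose(shape, strides, 1, 2)
# t_shape == (2, 8, 4, 16), t_strides == (512, 16, 128, 1)

print(can_merge_last_two(shape, strides))      # True
print(can_merge_last_two(t_shape, t_strides))  # False: 128 != 16 * 1
# After a copy into dense layout, the merge becomes legal again.
print(can_merge_last_two(t_shape, contiguous_strides(t_shape)))  # True
```

In PyTorch itself, the usual remedies are to call `.contiguous()` before `view()`, or to use `reshape()`, which copies only when a view is not possible.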