TensorRT issues

feat: Add support for TRT IAttention API

# Description Implement SDPA converter using TRT MHA API Fixes # (issue) ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix...

peri044

cla signed

Utilize Resource partitioner to run constrained models

Evaluate the performance of FLUX when using the resource partitioner

narendasan

✨[Feature] Reducing Overhead with C++ Torchbind operation getting called up to Python

**Is your feature request related to a problem? Please describe.** We are seeing that Torchbind operators from the C++ runtime are getting called into Python in order to dispatch. **Describe the...

narendasan

feature request

chore: Add Groot example

# Description Add Groot N1.5-3B compilation example ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix (non-breaking change which fixes an...

peri044

documentation

cla signed

✨[Feature] Enable refitting on resource-partitioned graphs

**Is your feature request related to a problem? Please describe.** **Describe the solution you'd like** **Describe alternatives you've considered** **Additional context**

cehongwang

feature request

✨[Feature] nn.module-defined atomic subgraph

**Is your feature request related to a problem? Please describe.** **Describe the solution you'd like** **Describe alternatives you've considered** **Additional context**

cehongwang

feature request

Allow Model Export Test Parallelism

2

# Description Fixes #3934 ## Type of change - Bug fix (non-breaking change which fixes an issue) # Checklist: - [x] My code follows the style guidelines of this project...

leimao

cla signed

🐛 [Bug] Engine cache failed on torch.compile backend=tensorrt

2

## Bug Description The engine cache tests failed: FAILED models/test_engine_cache.py::TestEngineCache::test_dynamo_compile_with_custom_engine_cache FAILED models/test_engine_cache.py::TestEngineCache::test_torch_compile_graph_break FAILED models/test_engine_cache.py::TestEngineCache::test_torch_compile_with_custom_engine_cache https://gitlab-master.nvidia.com/dl/dgx/pytorch/-/jobs/234818816 https://gitlab-master.nvidia.com/dl/dgx/pytorch/-/jobs/234818814 ## To Reproduce It can be reproduced on a local Linux workstation ## Expected behavior...

lanluo-nvidia

bug

Convert torch.cond

## TL;DR ## Goal(s) ## Tasks ```[tasklist] ### Tasks ``` ## Additional context

sbhendigeri

feature request

🐛 [Bug] nn.MultiheadAttention fails with Torch-TensorRT due to non-contiguous tensor before view()

1

## Bug Description When compiling a simple nn.MultiheadAttention module with Torch-TensorRT using the dynamo IR, I get a runtime error related to view() because the tensor returned by scaled_dot_product_attention is...

LinzhouLi

bug
