`ScatterElements` with Reduction (opset 16) Not Fully Supported
Short Description
Converting an ONNX model that contains a ScatterElements operation with a reduction such as "sum" (opset 16) to TensorRT via trtexec fails when the number of indices in the operation exceeds the number of outputs.
Conversion only succeeds when n_indices <= n_outputs.
Long Description
Consider the following PyTorch model snippet:
import torch
import torch_scatter

device: torch.device = ...
n_indices: int = ...
dim_size: int = ...
n_outputs: int = ...

e_dummy = torch.randn(size=(n_indices, dim_size), device=device)
index_dummy = torch.randint(high=n_outputs, size=(n_indices,), device=device)


class ScatterModule(torch.nn.Module):
    def forward(self, e: torch.Tensor, index: torch.Tensor) -> torch.Tensor:
        return torch_scatter.scatter(
            src=e,
            # broadcasting (should be done automatically anyway)
            index=index.unsqueeze(-1).expand(-1, e.shape[1]),
            dim=0,
            reduce="sum",
        )
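For reference, a minimal export sketch (my addition; the actual export script lives in the repository linked below). The input/output names and the output path are illustrative placeholders; opset 16 is what introduces the reduction attribute on ScatterElements:

# Hypothetical export step; torch_scatter's "sum" reduction is (to my
# understanding) built on torch.scatter_add_, which the ONNX exporter lowers
# to a ScatterElements node with reduction="add" when opset_version >= 16.
torch.onnx.export(
    ScatterModule().eval(),
    (e_dummy, index_dummy),
    "scatter.onnx",                  # placeholder path
    input_names=["e", "index"],      # illustrative names
    output_names=["out"],
    opset_version=16,
)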
Converting the corresponding ONNX model with trtexec triggers the following assertion error:
Assertion failed: indicesDims.d[i] <= dataDims.d[i] && "Indices dimensions must be less than data dimensions!"
This error likely originates from this line of the ONNX-TensorRT code.
In the scenarios I've encountered within Graph Neural Networks, the number of indices (n_indices, corresponding to the edges in the graph) is significantly larger than the number of outputs (n_outputs, corresponding to the nodes in the graph).
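To make the mismatch concrete (illustrative numbers matching the repository's example files): with 1000 edges and 100 nodes the scatter itself is perfectly well defined, but the indices tensor is longer than the data tensor along dim 0, which is exactly what the parser's dimension check rejects.

import torch
import torch_scatter

n_indices, dim_size, n_outputs = 1000, 3, 100        # edges, feature size, nodes
e = torch.randn(n_indices, dim_size)
index = torch.randint(high=n_outputs, size=(n_indices,))

out = torch_scatter.scatter(
    src=e,
    index=index.unsqueeze(-1).expand(-1, dim_size),
    dim=0,
    dim_size=n_outputs,
    reduce="sum",
)
print(out.shape)  # torch.Size([100, 3]): 1000 index rows are reduced into 100 output rows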
Environment
TensorRT Version: 8.6.1.6-1+cuda11.8 (I've also tried the TensorRT 9.2 release)
GPU Type: NVIDIA RTX A2000 (laptop)
Nvidia Driver Version: 520.61.05
CUDA Version: 11.8.0-1
CUDNN Version: 8.7.0.84-1+cuda11.8
Operating System + Version: Ubuntu 22.04.1 LTS
PyTorch Version: 2.1.2
Relevant Files
I've created a repository to reproduce the issue: anthony-correia/scatter_onnx2tensorrt.
The ONNX models are stored with the naming convention onnx/{n_indices}_{dim_size}_{n_outputs}_{seed}.onnx.
To replicate the issue, execute the following commands:
# This command fails when `n_outputs = 100` and `n_indices = 1000`.
trtexec --onnx="onnx/1000_3_100_0.onnx"
# This command succeeds when `n_outputs` equals `n_indices` (both are 100).
trtexec --onnx="onnx/100_3_100_0.onnx"
Cross-posted in https://github.com/onnx/onnx-tensorrt/issues/953; this repository seems more relevant.
@ttyio Is this in our plan? Thanks!
FYI, we will have a scatterElements plugin with reduction support in 10.0.
Great to hear that! Thanks a lot for letting me/us know!
Good to know it will be supported later. For now, do you have a method to work around this?
FYI, we will have a scatterElements plugin with reduction support in 10.0.
When?
It is supported now; see https://github.com/NVIDIA/TensorRT/blob/release/10.0/CHANGELOG.md:
Parser changes
Added a new class IParserRefitter that can be used to refit a TensorRT engine with the weights of an ONNX model.
kNATIVE_INSTANCENORM is now set to ON by default.
Added support for IPluginV3 interfaces from TensorRT.
Added support for INT4 quantization.
Added support for the reduction attribute in ScatterElements.
Added support for wrap padding mode in Pad.
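With a TensorRT 10.x installation, the previously failing model should therefore parse. A minimal verification sketch using the TensorRT Python API (my own, not from the thread; the model path is the one used above):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)   # explicit batch is the default in TensorRT 10
parser = trt.OnnxParser(network, logger)

with open("onnx/1000_3_100_0.onnx", "rb") as f:
    success = parser.parse(f.read())

print("parsed:", success)
for i in range(parser.num_errors):
    print(parser.get_error(i))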
Closing, thanks all!