TensorRT engine execution returns all 0s due to warning [ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.]
Greetings, everyone.
Right now I am working on this project: GitHub - facebookresearch/sam2: The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

I am using this guide to export the sam2.1 encoder [small] ONNX model: GitHub - ibaiGorordo/ONNX-SAM2-Segment-Anything: Python scripts for the Segment Anything 2 (SAM2) model in ONNX. The exported ONNX file can be successfully converted to a TensorRT engine file.

The hardware/software information is as follows:

[04/20/2025-17:31:16] [I] === Device Information ===
[04/20/2025-17:31:16] [I] Available Devices:
[04/20/2025-17:31:16] [I] Device 0: "Orin" UUID: GPU-a00bb704-da56-555b-a79c-65a4e3662de8
[04/20/2025-17:31:16] [I] Selected Device: Orin
[04/20/2025-17:31:16] [I] Selected Device ID: 0
[04/20/2025-17:31:16] [I] Selected Device UUID: GPU-a00bb704-da56-555b-a79c-65a4e3662de8
[04/20/2025-17:31:16] [I] Compute Capability: 8.7
[04/20/2025-17:31:16] [I] SMs: 16
[04/20/2025-17:31:16] [I] Device Global Memory: 62840 MiB
[04/20/2025-17:31:16] [I] Shared Memory per SM: 164 KiB
[04/20/2025-17:31:16] [I] Memory Bus Width: 256 bits (ECC disabled)
[04/20/2025-17:31:16] [I] Application Compute Clock Rate: 1.3 GHz
[04/20/2025-17:31:16] [I] Application Memory Clock Rate: 1.3 GHz
[04/20/2025-17:31:16] [I]
[04/20/2025-17:31:16] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[04/20/2025-17:31:16] [I]
[04/20/2025-17:31:16] [I] TensorRT version: 10.7.0
[04/20/2025-17:31:16] [I] Loading standard plugins
[04/20/2025-17:31:16] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 31, GPU 13403 (MiB)
[04/20/2025-17:31:18] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +928, GPU +749, now: CPU 1002, GPU 14197 (MiB)
[04/20/2025-17:31:18] [I] Start parsing network model.
Issue:
I have successfully loaded the engine file and fed it data using Python. However, the console prints the following when running inference:

2025-04-20 17:50:14.100165426 [W:onnxruntime:Default, scatter_nd.h:51 ScatterNDWithAtomicReduction] ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.
2025-04-20 17:50:14.100240944 [W:onnxruntime:Default, scatter_nd.h:51 ScatterNDWithAtomicReduction] ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.

Because of the above issue, all values in all output tensors are 0. Online searching hasn't turned up much helpful/valuable information on this issue. If you have any clue/idea about this, could you please share it with me? Thank you very much. :)
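For context on what the warning itself means: the ONNX ScatterND op writes updates into a tensor at given indices. When the same index appears more than once and reduction=='none', the result depends on write order, which is undefined on a parallel backend. A small NumPy analogy (not SAM2 code, just an illustration of the duplicated-index behavior the warning describes):

```python
import numpy as np

# ScatterND-style update: out[indices[i]] = updates[i].
data = np.zeros(4)
indices = np.array([1, 1, 2])          # index 1 is duplicated
updates = np.array([10.0, 20.0, 5.0])

# reduction=='none': which of 10.0 or 20.0 ends up at index 1 is
# order-dependent. NumPy happens to keep the last write; a parallel
# GPU kernel makes no such guarantee.
out_none = data.copy()
out_none[indices] = updates            # -> [0., 20., 5., 0.] in NumPy

# reduction=='add': duplicates accumulate, which is well defined.
out_add = data.copy()
np.add.at(out_add, indices, updates)   # -> [0., 30., 5., 0.]
```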
That warning comes from onnxruntime and not TensorRT. Could you try asking on the onnxruntime repo?
Hello there. I mean no offense, but once the ONNX file is converted to a TensorRT engine file, my inference code has nothing to do with onnxruntime. To be more specific, once I have the TensorRT engine file, I use another INDEPENDENT Python script that neither imports the onnxruntime package nor uses any onnxruntime API. I'm confused about why the warning still comes from onnxruntime even though it doesn't appear anywhere in my inference code.
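One way to double-check whether onnxruntime is being pulled into the process indirectly (e.g., as a dependency of some other package the inference script imports) is to inspect sys.modules at runtime. A minimal sketch, to be run at the end of the inference script:

```python
import sys

# List every module loaded in this process whose name mentions onnxruntime.
# If the script truly never touches onnxruntime, this list stays empty;
# a non-empty list means some import chain loaded it behind the scenes.
loaded = sorted(m for m in sys.modules if "onnxruntime" in m)
print(loaded)
```

Since the warning text follows onnxruntime's log format ([W:onnxruntime:Default, scatter_nd.h:51 ...]), a non-empty list here would confirm that onnxruntime is in fact active in the process despite not being imported explicitly.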
TensorRT does not call onnxruntime. Could you share the script for running the engine?
If you already have the model, you could also try running it with polygraphy run --onnxrt model.onnx and polygraphy run --trt model.onnx to run onnxruntime and TRT separately and compare the results.