TensorRT engine execution returns all 0s due to warning [ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.]
Greetings, everyone.
Right now I am working on this project: GitHub - facebookresearch/sam2: The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

I am using this guide to export the sam2.1 encoder [small] ONNX model: GitHub - ibaiGorordo/ONNX-SAM2-Segment-Anything: Python scripts for the Segment Anything 2 (SAM2) model in ONNX. The exported ONNX file can be successfully converted to a TensorRT engine file.

The hardware/software information is as follows:

[04/20/2025-17:31:16] [I] === Device Information ===
[04/20/2025-17:31:16] [I] Available Devices:
[04/20/2025-17:31:16] [I] Device 0: "Orin" UUID: GPU-a00bb704-da56-555b-a79c-65a4e3662de8
[04/20/2025-17:31:16] [I] Selected Device: Orin
[04/20/2025-17:31:16] [I] Selected Device ID: 0
[04/20/2025-17:31:16] [I] Selected Device UUID: GPU-a00bb704-da56-555b-a79c-65a4e3662de8
[04/20/2025-17:31:16] [I] Compute Capability: 8.7
[04/20/2025-17:31:16] [I] SMs: 16
[04/20/2025-17:31:16] [I] Device Global Memory: 62840 MiB
[04/20/2025-17:31:16] [I] Shared Memory per SM: 164 KiB
[04/20/2025-17:31:16] [I] Memory Bus Width: 256 bits (ECC disabled)
[04/20/2025-17:31:16] [I] Application Compute Clock Rate: 1.3 GHz
[04/20/2025-17:31:16] [I] Application Memory Clock Rate: 1.3 GHz
[04/20/2025-17:31:16] [I]
[04/20/2025-17:31:16] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[04/20/2025-17:31:16] [I]
[04/20/2025-17:31:16] [I] TensorRT version: 10.7.0
[04/20/2025-17:31:16] [I] Loading standard plugins
[04/20/2025-17:31:16] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 31, GPU 13403 (MiB)
[04/20/2025-17:31:18] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +928, GPU +749, now: CPU 1002, GPU 14197 (MiB)
[04/20/2025-17:31:18] [I] Start parsing network model.
Issue:
I have successfully loaded the engine file and fed it data using Python. However, the console prints the following when running inference:

2025-04-20 17:50:14.100165426 [W:onnxruntime:Default, scatter_nd.h:51 ScatterNDWithAtomicReduction] ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.
2025-04-20 17:50:14.100240944 [W:onnxruntime:Default, scatter_nd.h:51 ScatterNDWithAtomicReduction] ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.

Because of the above issue, all values in all output tensors are 0. Online searching hasn't turned up much helpful/valuable information on this issue. If you have any clue/idea about this, could you please share it with me? Thank you very much. :)
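For context on what the warning itself means: the ONNX ScatterND op writes updates into a tensor at given indices. When the same index appears more than once and reduction=='none', the result depends on write order, which is undefined on a parallel backend. A small NumPy analogy (not SAM2 code, just an illustration of the duplicated-index behavior the warning describes):

```python
import numpy as np

# ScatterND-style update: out[indices[i]] = updates[i].
data = np.zeros(4)
indices = np.array([1, 1, 2])          # index 1 is duplicated
updates = np.array([10.0, 20.0, 5.0])

# reduction=='none': which of 10.0 or 20.0 ends up at index 1 is
# order-dependent. NumPy happens to keep the last write; a parallel
# GPU kernel makes no such guarantee.
out_none = data.copy()
out_none[indices] = updates            # -> [0., 20., 5., 0.] in NumPy

# reduction=='add': duplicates accumulate, which is well defined.
out_add = data.copy()
np.add.at(out_add, indices, updates)   # -> [0., 30., 5., 0.]
```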
That warning comes from onnxruntime and not TensorRT. Could you try asking on the onnxruntime repo?
Hello there. I mean no offense, but once the ONNX file is converted to a TensorRT engine file, my inference code has nothing to do with onnxruntime. To be more specific, once I have the TensorRT engine file, I use another INDEPENDENT Python script that neither imports the onnxruntime package nor uses any onnxruntime API. I'm confused about why the warning still comes from onnxruntime even though it doesn't appear anywhere in my inference code.
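One way to double-check whether onnxruntime is being pulled into the process indirectly (e.g., as a dependency of some other package the inference script imports) is to inspect sys.modules at runtime. A minimal sketch, to be run at the end of the inference script:

```python
import sys

# List every module loaded in this process whose name mentions onnxruntime.
# If the script truly never touches onnxruntime, this list stays empty;
# a non-empty list means some import chain loaded it behind the scenes.
loaded = sorted(m for m in sys.modules if "onnxruntime" in m)
print(loaded)
```

Since the warning text follows onnxruntime's log format ([W:onnxruntime:Default, scatter_nd.h:51 ...]), a non-empty list here would confirm that onnxruntime is in fact active in the process despite not being imported explicitly.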
TensorRT does not call onnxruntime. Could you share the script for running the engine?
If you already have the model, you could also try running it with polygraphy run --onnxrt model.onnx and polygraphy run --trt model.onnx to run onnxruntime and TRT separately and compare the results.