TensorRT
"This version of TensorRT does not support dynamic axes." failure of TensorRT 8.5.2 and 8.6.1 when running groundingdino.onnx on GPU Tesla V100 and Tesla T4
Description
I tried to convert groundingdino.onnx to TensorRT on GPU, but it fails with the error below. The torch2onnx command:
caption = "the running dog ." #". ".join(input_text)
input_ids = model.tokenizer([caption], return_tensors="pt")["input_ids"]
position_ids = torch.tensor([[0, 0, 1, 2, 3, 0]])
token_type_ids = torch.tensor([[0, 0, 0, 0, 0, 0]])
attention_mask = torch.tensor([[True, True, True, True, True, True]])
text_token_mask = torch.tensor([[[ True, False, False, False, False, False],
[False, True, True, True, True, False],
[False, True, True, True, True, False],
[False, True, True, True, True, False],
[False, True, True, True, True, False],
[False, False, False, False, False, True]]])
img = torch.randn(1, 3, 512, 512)
# img = image[None]
dynamic_axes={
"input_ids": {0: "batch_size", 1: "seq_len"},
"attention_mask": {0: "batch_size", 1: "seq_len"},
"position_ids": {0: "batch_size", 1: "seq_len"},
"token_type_ids": {0: "batch_size", 1: "seq_len"},
"text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
"img": {0: "batch_size", 2: "height", 3: "width"},
"logits": {0: "batch_size"},
"boxes": {0: "batch_size"}
}
# output_model = model(img, input_ids, attention_mask, position_ids, token_type_ids, text_token_mask)
# print(f'output_model:{output_model}')
# outputs = output_model
onnx_path = "/GroundingDINO/weights/gd_token_dynamic_sigmoid_512_fold.onnx"
torch.onnx.export(
model,
f=onnx_path,
args=(img, input_ids, attention_mask, position_ids, token_type_ids, text_token_mask), #, zeros, ones),
input_names=["img" , "input_ids", "attention_mask", "position_ids", "token_type_ids", "text_token_mask"],
output_names=["logits", "boxes"],
do_constant_folding=True,
dynamic_axes=dynamic_axes,
opset_version=16)
Error log:
[12/15/2023-10:39:35] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[12/15/2023-10:39:36] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[12/15/2023-10:39:37] [E] [TRT] ModelImporter.cpp:726: While parsing node number 3432 [Slice -> "onnx::Slice_3626"]:
[12/15/2023-10:39:37] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[12/15/2023-10:39:37] [E] [TRT] ModelImporter.cpp:728: input: "onnx::Slice_3616"
input: "onnx::Slice_22788"
input: "onnx::Slice_3622"
input: "onnx::Slice_22789"
input: "onnx::Slice_3625"
output: "onnx::Slice_3626"
name: "Slice_3432"
op_type: "Slice"
[12/15/2023-10:39:37] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[12/15/2023-10:39:37] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:4531 In function importSlice:
[8] Assertion failed: (axes.allValuesKnown()) && "This version of TensorRT does not support dynamic axes."
[12/15/2023-10:39:37] [E] Failed to parse onnx file
[12/15/2023-10:39:37] [I] Finish parsing network model
[12/15/2023-10:39:37] [E] Parsing model failed
[12/15/2023-10:39:37] [E] Failed to create engine from model or file.
[12/15/2023-10:39:37] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=grounded_1207.onnx --saveEngine=grounded_1207.plan
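The offending Slice nodes can be located directly in the graph. Below is a minimal sketch using the onnx Python API (assuming the exported filename from the script above) that lists every Slice whose `axes` input is neither an initializer nor the output of a Constant node, which is the pattern the parser's `axes.allValuesKnown()` assertion rejects. This is only a heuristic, since the parser can fold some additional constant subgraphs at import time:

import onnx

model = onnx.load("/GroundingDINO/weights/gd_token_dynamic_sigmoid_512_fold.onnx")
graph = model.graph
# Names whose values are known at parse time: initializers and Constant outputs.
known = {init.name for init in graph.initializer}
known |= {out for node in graph.node if node.op_type == "Constant" for out in node.output}

for node in graph.node:
    # Slice (opset 10+) inputs: data, starts, ends, axes (optional), steps (optional)
    if node.op_type == "Slice" and len(node.input) >= 4 and node.input[3]:
        if node.input[3] not in known:
            print(f"{node.name}: dynamic `axes` input {node.input[3]!r}")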
Environment
TensorRT Version: 8.5.3
NVIDIA GPU: Tesla T4
NVIDIA Driver Version: 470.103.01
CUDA Version: 11.4
CUDNN Version:
Environment 2 (I tried two environments but got the same error)
TensorRT Version: 8.6.1
NVIDIA GPU: Tesla V100
NVIDIA Driver Version: 470.103.01
CUDA Version: 11.4
CUDNN Version:
Operating System:
Python Version (if applicable): 3.8
PyTorch Version (if applicable): 1.12.0
Relevant Files
Model link: https://github.com/IDEA-Research/GroundingDINO/issues/46
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?: Yes, I tried TensorRT 8.6.1 but got the same error.
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
Yes, I ran the ONNX model with ONNXRuntime and got the same results as the checkpoint's results.
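For reference, a minimal sketch of how such an ONNXRuntime check might look, reusing the tensor names and onnx_path from the export script above (not the exact script I used):

import onnxruntime as ort

sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
ort_logits, ort_boxes = sess.run(
    ["logits", "boxes"],
    {
        "img": img.numpy(),
        "input_ids": input_ids.numpy(),
        "attention_mask": attention_mask.numpy(),
        "position_ids": position_ids.numpy(),
        "token_type_ids": token_type_ids.numpy(),
        "text_token_mask": text_token_mask.numpy(),
    },
)
# Compare against the PyTorch outputs from the sanity check above.
print(ort_logits.shape, ort_boxes.shape)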
And I used this command to try to sanitize it:
./polygraphy surgeon sanitize --fold-constants ./input.onnx -o output.onnx
The result is:
[I] RUNNING | Command: /miniconda3/envs/tensorrt/bin/polygraphy surgeon sanitize gd_token_no_dy_sigmoid_512_fold.onnx --fold-constants -o gd_token_no_dy_sigmoid_512_fold_fold2.onnx
[I] Inferring shapes in the model with `onnxruntime.tools.symbolic_shape_infer`.
Note: To force Polygraphy to use `onnx.shape_inference` instead, set `allow_onnxruntime=False` or use the `--no-onnxruntime-shape-inference` command-line option.
[I] Loading model: /groundingdino/onnx/gd_token_no_dy_sigmoid_512_fold.onnx
[W] ONNX shape inference exited with an error:
[I] Loading model: //groundingdino/onnx/gd_token_no_dy_sigmoid_512_fold.onnx
[I] Original Model:
Name: torch_jit | ONNX Opset: 16
---- 6 Graph Input(s) ----
{img [dtype=float32, shape=(1, 3, 512, 512)],
input_ids [dtype=int64, shape=(1, 6)],
attention_mask [dtype=bool, shape=(1, 6)],
position_ids [dtype=int64, shape=(1, 6)],
token_type_ids [dtype=int64, shape=(1, 6)],
text_token_mask [dtype=bool, shape=(1, 6, 6)]}
---- 2 Graph Output(s) ----
{logits [dtype=float32, shape=('Gatherlogits_dim_0', 'Gatherlogits_dim_1')],
boxes [dtype=float32, shape=('Gatherboxes_dim_0', 'Gatherboxes_dim_1')]}
---- 998 Initializer(s) ----
---- 17280 Node(s) ----
[I] Folding Constants | Pass 1
2023-12-14 16:07:30.631822594 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684484028160, mask: 1, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:30.631813344 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684475635456, mask: 2, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:30.631950261 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684333025024, mask: 3, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:30.631980469 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684467242752, mask: 4, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: /onnxruntime_src/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
[W] ONNX shape inference exited with an error:
[I] Total Nodes | Original: 17280, After Folding: 11755 | 5525 Nodes Folded
[I] Folding Constants | Pass 2
2023-12-14 16:07:51.785005196 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684467242752, mask: 1, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:51.785046154 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684333025024, mask: 2, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:51.785046154 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684484028160, mask: 4, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2023-12-14 16:07:51.785059196 [E:onnxruntime:Default, env.cc:231 ThreadMain] pthread_setaffinity_np failed for thread: 140684475635456, mask: 3, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
[W] Inference failed. You may want to try enabling partitioning to see better results. Note: Error was:
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: /onnxruntime_src/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
[W] ONNX shape inference exited with an error:
[I] Total Nodes | Original: 11755, After Folding: 11755 | 0 Nodes Folded
[W] ONNX shape inference exited with an error:
[I] Saving ONNX model to: gd_token_no_dy_sigmoid_512_fold_fold2.onnx
[I] New Model:
Name: torch_jit | ONNX Opset: 16
---- 6 Graph Input(s) ----
{img [dtype=float32, shape=(1, 3, 512, 512)],
input_ids [dtype=int64, shape=(1, 6)],
attention_mask [dtype=bool, shape=(1, 6)],
position_ids [dtype=int64, shape=(1, 6)],
token_type_ids [dtype=int64, shape=(1, 6)],
text_token_mask [dtype=bool, shape=(1, 6, 6)]}
---- 2 Graph Output(s) ----
{logits [dtype=float32, shape=('Gatherlogits_dim_0', 'Gatherlogits_dim_1')],
boxes [dtype=float32, shape=('Gatherboxes_dim_0', 'Gatherboxes_dim_1')]}
---- 6523 Initializer(s) ----
---- 11755 Node(s) ----
[I] PASSED | Runtime: 89.214s | Command: /miniconda3/envs/tensorrt/bin/polygraphy surgeon sanitize gd_token_no_dy_sigmoid_512_fold.onnx --fold-constants -o gd_token_no_dy_sigmoid_512_fold_fold2.onnx
Then I tried to convert the folded ONNX to TRT, but got the same error as with the non-folded ONNX.
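As an aside, the "Unsupported model IR version: 9, max supported IR version: 8" errors in the polygraphy log mean the ONNXRuntime-based shape inference never actually ran during sanitization. Opset-16 models are expressible at IR version 8, so one possible workaround (a minimal sketch, assuming the filenames from the log) is to downgrade the model's IR version field before re-running polygraphy:

import onnx

model = onnx.load("gd_token_no_dy_sigmoid_512_fold.onnx")
model.ir_version = 8  # the installed ORT supports at most IR version 8
onnx.save(model, "gd_token_no_dy_sigmoid_512_fold_ir8.onnx")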
It's a known limitation.
You have two solutions:
- If you don't need dynamic shapes, export the ONNX with static shapes and constant folding to eliminate the dynamic axes (see the sketch below).
- Try to fix it in the PyTorch/TF source code, although this may still be impossible.
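A minimal sketch of the static-shape route, reusing the tensors from the export script above (the output path is illustrative): simply omit dynamic_axes so every dimension is baked into the graph.

torch.onnx.export(
    model,
    args=(img, input_ids, attention_mask, position_ids, token_type_ids, text_token_mask),
    f="gd_token_static_512.onnx",  # illustrative output path
    input_names=["img", "input_ids", "attention_mask", "position_ids", "token_type_ids", "text_token_mask"],
    output_names=["logits", "boxes"],
    do_constant_folding=True,
    opset_version=16,
)

Then run polygraphy surgeon sanitize --fold-constants on the result before passing it to trtexec, so the now-constant Slice axes get folded away.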
Folks, has anyone managed to convert it? GroundingDINO to TensorRT.
Same problem.
@yuanyao-nv ^ ^
I am facing the same problem. It's about the dynamic batch size.
Recently, I met the same problem. Try updating your PyTorch to 2.x and converting again; this was my solution.
I met the same problem. It was solved after I disabled the ONNX optimizer.
Dynamic slice axes will be supported in the next release (10.3).