efficientvit
Unable to convert SAM model to TensorRT
Hi, I am unable to export the SAM decoder model to TensorRT. It fails with the following error -
[02/28/2024-12:37:37] [V] [TRT] Registering layer: /Tile for ONNX node: /Tile
[02/28/2024-12:37:37] [E] Error[4]: [graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[02/28/2024-12:37:37] [E] [TRT] ModelImporter.cpp:771: While parsing node number 108 [Tile -> "/Tile_output_0"]:
[02/28/2024-12:37:37] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[02/28/2024-12:37:37] [E] [TRT] ModelImporter.cpp:773: input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"
[02/28/2024-12:37:37] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[02/28/2024-12:37:37] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /Tile
[graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[02/28/2024-12:37:37] [E] Failed to parse onnx file
[02/28/2024-12:37:37] [I] Finished parsing network model. Parse time: 0.13611
[02/28/2024-12:37:37] [E] Parsing model failed
[02/28/2024-12:37:37] [E] Failed to create engine from model or file.
[02/28/2024-12:37:37] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=assets/export_models/sam/onnx/xl1_decoder.onnx --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:16x2x2,point_labels:16x2 --maxShapes=point_coords:16x2x2,point_labels:16x2 --fp16 --saveEngine=assets/export_models/sam/tensorrt/xl1_decoder.engine --verbose
Environment:
torch 2.2.1
torchaudio 2.2.1
torchelastic 0.2.2
torchpack 0.3.1
torchprofile 0.0.4
torchvision 0.17.1
onnx 1.15.0
onnxruntime 1.17.1
onnxsim 0.4.35
ii libnvinfer-bin 8.6.1.6-1+cuda12.0 amd64 TensorRT binaries
ii libnvinfer-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT development libraries
ii libnvinfer-dispatch-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT development dispatch runtime libraries
ii libnvinfer-dispatch8 8.6.1.6-1+cuda12.0 amd64 TensorRT dispatch runtime library
ii libnvinfer-headers-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT development headers
ii libnvinfer-headers-plugin-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT plugin headers
ii libnvinfer-lean-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT lean runtime libraries
ii libnvinfer-lean8 8.6.1.6-1+cuda12.0 amd64 TensorRT lean runtime library
ii libnvinfer-plugin-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.6.1.6-1+cuda12.0 amd64 TensorRT plugin libraries
ii libnvinfer-samples 8.6.1.6-1+cuda12.0 all TensorRT samples
ii libnvinfer-vc-plugin-dev 8.6.1.6-1+cuda12.0 amd64 TensorRT vc-plugin library
ii libnvinfer-vc-plugin8 8.6.1.6-1+cuda12.0 amd64 TensorRT vc-plugin library
ii libnvinfer8 8.6.1.6-1+cuda12.0 amd64 TensorRT runtime libraries
I encountered a similar error:
[02/29/2024-10:13:33] [E] [TRT] ModelImporter.cpp:751: --- End node ---
[02/29/2024-10:13:33] [E] [TRT] ModelImporter.cpp:754: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - /image_encoder/backbone/stages.4/op_list.1/context_module/main/Pad
[shuffleNode.cpp::symbolicExecute::391] Error Code 4: Internal Error (/image_encoder/backbone/stages.4/op_list.1/context_module/main/Reshape_1: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[02/29/2024-10:13:33] [E] Failed to parse onnx file
[02/29/2024-10:13:33] [I] Finish parsing network model
[02/29/2024-10:13:33] [E] Parsing model failed
[02/29/2024-10:13:33] [E] Failed to create engine from model.
[02/29/2024-10:13:33] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8400] # trtexec --onnx=assets/export_models/sam/onnx/xl1_encoder.onnx --minShapes=input_image:1x3x1024x1024 --optShapes=input_image:4x3x1024x1024 --maxShapes=input_image:4x3x1024x1024 --saveEngine=assets/export_models/sam/tensorrt/xl1_encoder.engine
I had a similar issue and solved it by monkey-patching the problematic functions in the original SAM repo that use torch.repeat_interleave, see #78
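For anyone hitting the same Tile/OneHot parse error: the idea is to swap `torch.repeat_interleave` for an `unsqueeze` + `expand` + `flatten` sequence, which exports to ONNX ops TensorRT can parse. A minimal sketch of such a replacement (the function name and the patch target in the comment are illustrative, not the exact fix from #78):

```python
import torch

def repeat_interleave_trt(x: torch.Tensor, repeats: int, dim: int = 0) -> torch.Tensor:
    """TensorRT-friendly equivalent of torch.repeat_interleave for an
    integer repeat count. Inserts a size-1 dim, broadcasts along it,
    then merges the two dims, avoiding the OneHot/Tile pattern that the
    ONNX exporter emits for torch.repeat_interleave."""
    x = x.unsqueeze(dim + 1)            # (..., d, ...) -> (..., d, 1, ...)
    shape = list(x.shape)
    shape[dim + 1] = repeats            # broadcast the new dim to `repeats`
    x = x.expand(shape)
    return x.flatten(dim, dim + 1)      # (..., d, repeats, ...) -> (..., d*repeats, ...)

# Sanity check against the original op:
a = torch.arange(6).reshape(2, 3)
assert torch.equal(repeat_interleave_trt(a, 2, 0), torch.repeat_interleave(a, 2, 0))
assert torch.equal(repeat_interleave_trt(a, 3, 1), torch.repeat_interleave(a, 3, 1))
```

You would then monkey-patch the call sites in the SAM decoder (e.g. replace the `torch.repeat_interleave` calls with this function) before running the ONNX export, so the exported graph never contains the offending nodes.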
Thanks for your solution! I encountered the same problem under TensorRT 10.0.1.6 / 8.6.3. It's worth noting that when I tried TensorRT 8.6.1, trt did not report any error, so I am not sure whether they changed how some of these nodes are parsed between versions.