TensorRT Error Code 4: Internal Error (Conv_1840: number of kernel weights does not match tensor dimensions)

Description

I found that the ONNX model holds twice as many kernel weights as TensorRT expects for this node, but it's not clear how to fix this error.

[08/17/2022-17:06:41] [TRT] [E] Conv_1840: count of 1474560 weights in kernel, but kernel dimensions (3,3) with 320 input channels, 512 output channels and 2 groups were specified. Expected Weights count is 320 * 3*3 * 512 / 2 = 737280
[08/17/2022-17:06:42] [TRT] [E] [convolutionNode.cpp::computeOutputExtents::43] Error Code 4: Internal Error (Conv_1840: number of kernel weights does not match tensor dimensions)
[08/17/2022-17:06:42] [TRT] [V] Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 512
[08/17/2022-17:06:42] [TRT] [V] Convolution output dimensions: ()
ERROR: Failed to parse the ONNX file: end2end.onnx
ERROR: Failed to parse the ONNX file.
got 1 errors: 
In node 1840 (parseGraph): INVALID_NODE: Invalid Node - Conv_1840
Conv_1840:kernel weights has count 1474560 but 737280 was expected
Conv_1840: count of 1474560 weights in kernel, but kernel dimensions (3,3) with 320 input channels, 512 output channels and 2 groups were specified. Expected Weights count is 320 * 3*3 * 512 / 2 = 737280
[convolutionNode.cpp::computeOutputExtents::43] Error Code 4: Internal Error (Conv_1840: number of kernel weights does not match tensor dimensions)
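
The arithmetic in the log can be reproduced directly. The sketch below is illustrative only: the helper name `expected_conv_weight_count` is made up for this example and is not part of TensorRT. It applies the formula from the error message (input channels * kernel height * kernel width * output channels / groups) and compares the result with the count reported for the ONNX file.

```python
def expected_conv_weight_count(in_channels, out_channels, kernel_hw, groups=1):
    """Number of weights a grouped 2D convolution should hold (bias excluded)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    k_h, k_w = kernel_hw
    return (in_channels // groups) * k_h * k_w * out_channels


# Values taken from the Conv_1840 error message.
expected = expected_conv_weight_count(320, 512, (3, 3), groups=2)
found = 1474560

print(expected)                                               # 737280
print(found == expected_conv_weight_count(320, 512, (3, 3)))  # True
```

The stored kernel therefore has the size of an ungrouped convolution over 320 channels (or, equivalently, of a 2-group convolution over 640 input channels), which suggests that the `group` attribute, the weight tensor, and the inferred input shape disagree somewhere in the exported graph.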

Environment

TensorRT Version:
NVIDIA GPU: cu113
CUDA Version: 8.2
Operating System: ubuntu20.04
Python Version (if applicable): 3.7.13
PyTorch Version (if applicable): 1.12+cu113

Relevant Files

https://drive.google.com/drive/folders/1X2_wBV4DOykZ5eQovbYGARDUsg0ZPxwX?usp=sharing

Steps To Reproduce

Aug 17 '22 09:08 980202006

Maybe a bug. What do your input dimensions look like?

Aug 17 '22 10:08 zerollzeng

error node:

Aug 17 '22 10:08 zerollzeng

Where can I get this visualizer? The input looks like [1,6,3,720,1296].

Aug 17 '22 10:08 980202006

https://netron.app/
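
Besides viewing the graph in Netron, the node can also be inspected from Python. The following is a rough sketch that assumes the `onnx` package is installed and that the kernel of `Conv_1840` is stored as a graph initializer (if it comes from a `Constant` node instead, the weight dims come back as `None`). Both helper names are hypothetical, and the (512, 320, 3, 3) kernel shape is an assumption chosen because it matches the count of 1474560 in the log.

```python
def implied_input_channels(weight_dims, group):
    """ONNX Conv weights are laid out as (M, C / group, kH, kW), so C = dims[1] * group."""
    return weight_dims[1] * group


def inspect_conv(model_path, node_name):
    """Return (weight dims, group) for the named Conv node, or None if not found."""
    import onnx  # imported lazily so the helper above works without onnx

    graph = onnx.load(model_path).graph
    initializers = {init.name: list(init.dims) for init in graph.initializer}
    for node in graph.node:
        if node.name == node_name and node.op_type == "Conv":
            # "group" defaults to 1 when the attribute is absent.
            group = next((a.i for a in node.attribute if a.name == "group"), 1)
            return initializers.get(node.input[1]), group
    return None


# Assumed kernel shape (512, 320, 3, 3): with group = 2 it implies 640 input
# channels, not the 320 that TensorRT reports for this node.
print(implied_input_channels([512, 320, 3, 3], group=2))  # 640

# Usage on the real file (file name taken from the log):
# print(inspect_conv("end2end.onnx", "Conv_1840"))
```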

Aug 17 '22 12:08 zerollzeng

@zerollzeng I observed that there is a parameter max_workspace_size, which may be the largest batch size when exporting the model. What determines max_workspace_size? Will fp16 cause max_workspace_size to become smaller?

Aug 18 '22 06:08 980202006

@zerollzeng Is there a way to map the problematic operator in onnx to the torch model code?

Aug 18 '22 06:08 980202006

@zerollzeng Is there a way to map the problematic operator in onnx to the torch model code?

I tried to find an answer to this before but failed :-( so I don't think it's possible. Also, AFAIK the exported node names change across PyTorch versions.

Aug 18 '22 12:08 zerollzeng

@zerollzeng I observed that there is a parameter max_workspace_size, which may be the largest batch size when exporting the model. What determines max_workspace_size? Will fp16 cause max_workspace_size to become smaller?

Yes, but since 8.4 you don't need to worry about the workspace size. We set it to the maximum by default.
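
For context, the workspace size is an upper bound on the temporary GPU memory that layer implementations (tactics) may use; it is not a batch size. Below is a minimal sketch of setting it in a version-tolerant way, assuming the TensorRT Python API. `set_workspace_limit` is a hypothetical helper: it uses `set_memory_pool_limit` where the config object offers it (TensorRT 8.4 and later) and falls back to the older `max_workspace_size` attribute otherwise.

```python
def set_workspace_limit(config, num_bytes):
    """Cap the builder workspace on either the newer or the older config API."""
    if hasattr(config, "set_memory_pool_limit"):
        import tensorrt as trt  # only needed for the enum on the newer API

        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, num_bytes)
    else:
        # Pre-8.4 style attribute.
        config.max_workspace_size = num_bytes


# Usage with a real builder (requires TensorRT):
# config = builder.create_builder_config()
# set_workspace_limit(config, 1 << 30)  # 1 GiB
```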

Aug 18 '22 12:08 zerollzeng

Thank you. I have run into another problem. Do you have any ideas about it?

Process Process-3:
Traceback (most recent call last):
  File "/root/miniconda3/envs/mmdeploy/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/mmdeploy/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/mmdeploy/mmdeploy/backend/tensorrt/onnx2tensorrt.py", line 88, in onnx2tensorrt
    device_id=device_id)
  File "/home/mmdeploy/mmdeploy/backend/tensorrt/utils.py", line 113, in from_onnx
    raise RuntimeError(f'Failed to parse onnx, {error_msgs}')
RuntimeError: Failed to parse onnx, In node 4622 (addScatterLayer): UNSUPPORTED_NODE: Assertion failed: indicesDims.d[i] <= dataDims.d[i] && "Indices dimensions must be less than data dimensions!"

Aug 19 '22 02:08 980202006

@zerollzeng

Aug 19 '22 02:08 980202006

The error is raised here: https://github.com/onnx/onnx-tensorrt/blob/1da7332349d5b1196ccfa6dc719b839876f1e83e/onnx2trt_utils.cpp#L2265 It happens while parsing the ONNX file. You can check node 4622 in your ONNX model, or share the model here so that I can take a look.
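
The assertion text states the condition being checked: along every axis, the indices tensor must not be larger than the data tensor. Here is a small sketch of that check in plain Python, which can be applied to the shapes shown for node 4622 in Netron. The function name and the example shapes are made up for illustration, and the equal-rank requirement is an assumption based on the ONNX ScatterElements definition.

```python
def scatter_dims_ok(indices_dims, data_dims):
    """Mirror of the parser condition indicesDims.d[i] <= dataDims.d[i] for all i."""
    if len(indices_dims) != len(data_dims):
        return False  # assumption: indices and data must have the same rank
    return all(i <= d for i, d in zip(indices_dims, data_dims))


print(scatter_dims_ok([1, 3, 4], [1, 3, 8]))   # True
print(scatter_dims_ok([1, 3, 16], [1, 3, 8]))  # False
```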

Aug 20 '22 11:08 zerollzeng

https://drive.google.com/file/d/1XJ86EWnUmdHEOMgYCsQHs9ESJmdlgbIW/view?usp=sharing

[08/22/2022-10:44:21] [TRT] [V] Graph construction and optimization completed in 28.9074 seconds.
[08/22/2022-10:44:22] [TRT] [V] Using cublasLt as a tactic source
[08/22/2022-10:44:22] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +485, GPU +206, now: CPU 1411, GPU 514 (MiB)
[08/22/2022-10:44:22] [TRT] [V] Using cuDNN as a tactic source
[08/22/2022-10:44:23] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +468, GPU +204, now: CPU 1879, GPU 718 (MiB)
[08/22/2022-10:44:23] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.2.4
[08/22/2022-10:44:23] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[08/22/2022-10:44:23] [TRT] [V] Constructing optimization profile number 0 [1/1].
[08/22/2022-10:44:23] [TRT] [E] 4: [shapeCompiler.cpp::evaluateShapeChecks::911] Error Code 4: Internal Error (kOPT values for profile 0 violate shape constraints: reshape would change volume. IShuffleLayer Reshape_4296: reshaping failed for tensor: onnx::Reshape_5167)
Traceback (most recent call last):
  File "/root/miniconda3/envs/mmdeploy/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/mmdeploy/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 322, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 136, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/root/.vscode-server/extensions/ms-python.python-2022.12.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/mmdeploy/to_fp16.py", line 274, in <module>
    build_engine_onnx(onnx_model_file)
  File "/home/mmdeploy/to_fp16.py", line 198, in build_engine_onnx
    with builder.build_engine(network, config) as engine, open(args.engine_file, "wb") as f:
AttributeError: __enter__
(base) root@ecs-0:/home/mmdeploy# conda activate mmdeploy
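
The final `AttributeError: __enter__` is most likely a secondary symptom: `builder.build_engine` returns `None` when the build fails, and `None` cannot be used in a `with` statement. Here is a hedged sketch of guarding against that, using a hypothetical `save_engine` helper and assuming the engine object exposes `serialize()` as `ICudaEngine` does:

```python
def save_engine(engine, path):
    """Write a built engine to disk, failing loudly if the build returned None."""
    if engine is None:
        raise RuntimeError(
            "engine build failed; check the [TRT] [E] lines above the traceback"
        )
    with open(path, "wb") as f:
        f.write(engine.serialize())


# Usage in place of `with builder.build_engine(network, config) as engine, ...`:
# engine = builder.build_engine(network, config)
# save_engine(engine, args.engine_file)
```

With this guard the script stops at the real cause (the shape-constraint error) instead of at the `with` statement.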

Aug 22 '22 03:08 980202006

@zerollzeng Thanks, here is my onnx file.

Aug 22 '22 09:08 980202006

Do you use dynamic shapes? It looks like your model doesn't support dynamic shapes, or your input dimensions are invalid:

[E] 4: [shapeCompiler.cpp::evaluateShapeChecks::911] Error Code 4: Internal Error (kOPT values for profile 0 violate shape constraints: reshape would change volume. IShuffleLayer Reshape_4296: reshaping failed for tensor: onnx::Reshape_5167)
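
"Reshape would change volume" means that, for the shapes given in the optimization profile, the element count before and after the reshape differs. Here is a quick sanity check in plain Python. The helpers are hypothetical, and the target shape below is only an example, not taken from the model.

```python
def volume(shape):
    """Total number of elements in a tensor of the given static shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def reshape_is_valid(src_shape, dst_shape):
    """A static reshape is only legal if both shapes hold the same number of elements."""
    return volume(src_shape) == volume(dst_shape)


# Input shape reported earlier in the thread, folded to 4D (example target).
print(reshape_is_valid([1, 6, 3, 720, 1296], [6, 3, 720, 1296]))  # True
# The same target fed from a single-frame input would change the volume.
print(reshape_is_valid([1, 3, 720, 1296], [6, 3, 720, 1296]))     # False
```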

Aug 22 '22 15:08 zerollzeng

I can't reproduce your error on my side because your model contains your own plugin:

[08/22/2022-15:37:14] [I] [TRT] No importer registered for op: grid_sampler. Attempting to import as plugin.
[08/22/2022-15:37:14] [I] [TRT] Searching for plugin: grid_sampler, plugin_version: 1, plugin_namespace:
[08/22/2022-15:37:14] [E] [TRT] parsers/onnx/ModelImporter.cpp:773: While parsing node number 292 [grid_sampler -> "onnx::Concat_561"]:
[08/22/2022-15:37:14] [E] [TRT] parsers/onnx/ModelImporter.cpp:774: --- Begin node ---
[08/22/2022-15:37:14] [E] [TRT] parsers/onnx/ModelImporter.cpp:775: input: "x.19"
input: "grid_flow"
output: "onnx::Concat_561"
name: "grid_sampler_292"
op_type: "grid_sampler"
attribute {
  name: "align_corners"
  i: 1
  type: INT
}
attribute {
  name: "interpolation_mode"
  i: 0
  type: INT
}
attribute {
  name: "padding_mode"
  i: 1
  type: INT
}
domain: "mmdeploy"

[08/22/2022-15:37:14] [E] [TRT] parsers/onnx/ModelImporter.cpp:776: --- End node ---
[08/22/2022-15:37:14] [E] [TRT] parsers/onnx/ModelImporter.cpp:778: ERROR: parsers/onnx/builtin_op_importers.cpp:4890 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

Aug 22 '22 15:08 zerollzeng

My command using trtexec:

&&&& FAILED TensorRT.trtexec [TensorRT v8401] # trtexec --onnx=end2end_new.onnx --optShapes=input:1x3x720x1296

Aug 22 '22 15:08 zerollzeng

https://drive.google.com/drive/folders/1X2_wBV4DOykZ5eQovbYGARDUsg0ZPxwX?usp=sharing @zerollzeng The custom operator .so files required by my model and the Python code used for exporting are here.
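
Custom-op libraries like these generally have to be loaded into the process before the ONNX file is parsed, so that their plugin creators are registered and the "Plugin not found" assertion is avoided. Here is a sketch using `ctypes`. The helper name and the library path in the usage comment are placeholders, not the actual file names from the shared folder.

```python
import ctypes
import os


def load_plugin_libraries(paths):
    """Load each shared library so its TensorRT plugin creators get registered."""
    handles = []
    for path in paths:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"plugin library not found: {path}")
        handles.append(ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL))
    return handles


# Usage (placeholder path), before creating the ONNX parser:
# load_plugin_libraries(["/path/to/custom_ops.so"])
```

With trtexec in the 8.x series, the same effect should be achievable by passing the library through the `--plugins` option.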

Aug 23 '22 03:08 980202006

@980202006 Does the error still exist in the latest 8.6? Thanks!

Jul 18 '23 17:07 ttyio

I don't remember. I bypassed this problem by rewriting the torch forward inference code.

Jul 21 '23 07:07 980202006

Okay, I'm closing this now. Feel free to reopen it if you have any further questions.

Jul 21 '23 09:07 zerollzeng

Can you tell me how to fix this issue?

Mar 18 '24 05:03 Liupei1101

Have you fixed the issue?

Apr 09 '24 03:04 Liupei1101

I have the same problem: Conv_1840 fails with the same "number of kernel weights does not match tensor dimensions" error as in the original report (count of 1474560 weights in kernel, 737280 expected). I do not know why.

Apr 09 '24 07:04 Liupei1101

ERROR: Failed to parse the ONNX file: end2end.onnx
ERROR: Failed to parse the ONNX file.

I have the same problem. How can I solve it?

Apr 09 '24 07:04 Liupei1101
