
[optimizer.cpp::computeCosts::1981] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[Reshape_419 + Transpose_420...Gather_2423]}

Open lucasjinreal opened this issue 4 years ago • 21 comments

[optimizer.cpp::computeCosts::1981] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[Reshape_419 + Transpose_420...Gather_2423]}

I have an ONNX model which can be successfully converted to TRT using TensorRT 7.3, but when I upgraded to TensorRT 8 and used onnx-tensorrt from the master branch, the conversion failed.

Any idea where the error above comes from?

All the ops in my model are:

Exploring on onnx model: detr_sim.onnx_changed.onnx
ONNX model sum on: detr_sim.onnx_changed.onnx


-------------------------------------------
ir version: 7
opset_import: 12 
producer_name: 
doc_string: 
all ops used: Split,Squeeze,Pad,Unsqueeze,Concat,Conv,Mul,Add,Relu,MaxPool,Reshape,Transpose,MatMul,Div,Softmax,ReduceMean,Sub,Pow,Sqrt,Sigmoid,Gather

These ops are all already supported.
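For reference, an op summary like the one above can be reproduced with a short script using the onnx Python package (a minimal sketch; the model path is the one from the summary above):

import onnx

# Load the model and collect the distinct op types used in its graph
model = onnx.load("detr_sim.onnx_changed.onnx")
ops = sorted({node.op_type for node in model.graph.node})
print("all ops used:", ",".join(ops))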

Full output:

----------------------------------------------------------------
Input filename:   detr_sim.onnx_changed.onnx
ONNX IR version:  0.0.7
Opset version:    12
Producer name:    
Producer version: 
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Parsing model
[2021-10-11 11:40:22 WARNING] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Building TensorRT engine, FP16 available:1
    Max batch size:     32
    Max workspace size: 2288.71 MiB
[2021-10-11 11:40:40 WARNING] Skipping tactic 0 due to Myelin error: autotuning: CUDA error 3 allocating 0-byte buffer: 
[2021-10-11 11:40:41   ERROR] 10: [optimizer.cpp::computeCosts::1981] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[Reshape_419 + Transpose_420...Gather_2423]}.)
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to create object
[1]    264517 abort (core dumped)  onnx2trt detr_sim.onnx_changed.onnx -o detr.trt -w 2399889023

lucasjinreal avatar Oct 11 '21 11:10 lucasjinreal

Are you able to share your model?

kevinch-nv avatar Nov 05 '21 19:11 kevinch-nv

The same issue:

[12/16/2021-22:43:11] [E] Error[10]: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer* 354) [LoopOutput][length][Constant]...Sigmoid_147]}.)

xk-wang avatar Dec 16 '21 14:12 xk-wang

Did you fix it?

sushe2111 avatar Jan 01 '22 01:01 sushe2111

Hello, did you fix it?

tiny-cold-hands avatar Jan 10 '22 08:01 tiny-cold-hands

Did you fix it?

Yes, I used TensorRT directly instead of onnx-tensorrt to convert the ONNX model to a TensorRT engine file, and it finally succeeded. I think the reason may be that the TensorRT version I use is too new. Here is the code showing how I did it.

import sys
import os
import argparse
import tensorrt as trt

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="PyTorch Object Detection Inference")
    parser.add_argument("onnx_path", type=str)
    parser.add_argument("trt_path", type=str)
    args = parser.parse_args()
    onnx_file_path = args.onnx_path
    engine_file_path = args.trt_path
    print('get start')
    TRT_LOGGER = trt.Logger()
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        config = builder.create_builder_config()
        config.max_workspace_size = (1 << 30) * 2  # 2 GiB
        builder.max_batch_size = 16
        config.set_flag(trt.BuilderFlag.FP16)
        # builder.fp16_mode = True
        # Parse model file
        if not os.path.exists(onnx_file_path):
            print('ONNX file {} not found.'.format(onnx_file_path))
            sys.exit(1)
        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                sys.exit(1)  # stop: building from a partially parsed network will only fail later
        print(f"raw shape of {network.get_input(0).name} is: ", network.get_input(0).shape)
        # network.get_input(0).shape = [-1, 3, 32, -1] #dynamic model example
        for i in range(1):
            profile = builder.create_optimization_profile()
            # min / opt / max shapes for the optimization profile
            profile.set_shape(network.get_input(0).name, (1, 512, 229), (12, 512, 229), (16, 512, 229))
            config.add_optimization_profile(profile)
        print('Completed parsing of ONNX file')
        print('Building an engine from file {}; this may take a while...'.format(onnx_file_path))
        engine = builder.build_engine(network, config)
        print("Completed creating Engine")
        with open(engine_file_path, "wb") as f:
            f.write(engine.serialize())
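
(For reference, the script takes the ONNX path and the output engine path as positional arguments, so a run looks like python convert_to_trt.py model.onnx model.trt, where convert_to_trt.py and the file names are placeholders for whatever you use.)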

xk-wang avatar Jan 12 '22 12:01 xk-wang

I met the same error with TRT 8.2.3.0:

[01/28/2022-07:54:29] [TRT] [E] 10: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[113 + (Unnamed Layer* 72) [Shuffle]...(Unnamed Layer* 171) [Shuffle]]}.)

With TRT 7.2.3 everything works fine.

In both cases I use trt.OnnxParser.

akhoroshev avatar Jan 28 '22 04:01 akhoroshev

Building an engine from file gpt2-pretrained.onnx; this may take a while...
[02/19/2022-21:14:43] [TRT] [W] Skipping tactic 0 due to insuficient memory on requested size of 1170309120 detected for tactic 0.
[02/19/2022-21:14:43] [TRT] [E] 10: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[transformer.wte.weight...MatMul_2899]}.)
[02/19/2022-21:14:43] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )

I faced the same error with TRT 8.2.2.1 and solved it by increasing the workspace size.
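For TRT 8.2-style builds the knob is the builder config's workspace limit; a minimal sketch (the 4 GiB value is only illustrative, pick something your GPU can actually back):

import tensorrt as trt

TRT_LOGGER = trt.Logger()
builder = trt.Builder(TRT_LOGGER)
config = builder.create_builder_config()
# Raise the workspace limit so large tactics are not skipped for lack of scratch memory
config.max_workspace_size = 4 << 30  # 4 GiB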

TaylorHere avatar Feb 19 '22 21:02 TaylorHere

I met the same error with TRT 8.2.2. Did anyone fix it?

YukSing12 avatar Apr 08 '22 10:04 YukSing12

@kevinch-nv I had a similar issue, maybe it's related? https://github.com/NVIDIA/TensorRT/issues/1581

I added the model in the issue

andreabrduque avatar Apr 08 '22 12:04 andreabrduque

I had a similar issue with the tensorrt:22.02-py3 container, as follows:

[optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node

and solved it by increasing the workspace size.
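(With trtexec inside that container, the equivalent is roughly trtexec --onnx=model.onnx --workspace=8192, workspace in MiB; the flag name is the 8.2-era spelling and model.onnx is a placeholder.)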

dave-rtzr avatar Apr 11 '22 17:04 dave-rtzr

Same issue with the BERT model exported by transformers[onnx] on a V100. It cannot be solved by increasing the workspace size.

[04/22/2022-21:36:34] [TRT] [W] onnx2trt_utils.cpp:365: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/22/2022-21:36:38] [TRT] [W] Skipping tactic 0 due to Myelin error: autotuning: CUDA error 3 allocating 0-byte buffer: 
[04/22/2022-21:36:38] [TRT] [E] 10: [optimizer.cpp::computeCosts::2033] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[embeddings.position_embeddings.weight...Tanh_1160]}.)
>>> print(tensorrt.__version__)
8.4.0.6

GPU driver 510.47.03, CUDA toolkit 11.6.

Interestingly, the same code and model work fine on an RTX 3070 Laptop GPU.

yaoyaoding avatar Apr 22 '22 21:04 yaoyaoding

@yaoyaoding max_workspace_size seems to have been deprecated in 8.4, and I found a new API, set_memory_pool_limit. I haven't tested it yet.
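A rough sketch of how that call looks on 8.4+ (untested here, the 2 GiB value is only illustrative):

import tensorrt as trt

TRT_LOGGER = trt.Logger()
builder = trt.Builder(TRT_LOGGER)
config = builder.create_builder_config()
# TRT 8.4+ replacement for config.max_workspace_size
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 2 << 30)  # 2 GiB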

grimoire avatar Apr 25 '22 13:04 grimoire

@grimoire I have tried both

config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)

and

config.max_workspace_size = ...

Neither works.

yaoyaoding avatar Apr 25 '22 20:04 yaoyaoding

I also hit this bug:

[06/23/2022-09:02:18] [TRT] [E] 10: [optimizer.cpp::computeCosts::3826] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[encoder.cross_views.0.cross_attend.to_q.1.bias + (Unnamed Layer* 212) [Shuffle]...Transpose_749]}.)
[06/23/2022-09:02:18] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::620] Error Code 2: Internal Error (Assertion engine != nullptr failed. )

scuizhibin avatar Jun 23 '22 01:06 scuizhibin

Hello @kevinch-nv, I met the error in the environment "TensorRT 8.2.5.1 + CUDA 11.3 + cuDNN 8.2" with an RTX 3090, but it does not appear with the same stack on a GTX 1050 Ti, so I guess the cause is the difference in GPUs. I also tried the newest TensorRT 8.4 (8.4.0 GA + CUDA 11.6 + cuDNN 8.4), and the same error still exists.

[07/01/2022-10:46:09] [W] [TRT] Skipping tactic 0x0000000000000000 due to Myelin error: autotuning: CUDA error 3 allocating 0-byte buffer: 
[07/01/2022-10:46:09] [E] Error[10]: [optimizer.cpp::computeCosts::3628] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer* 126) [Constant] + (Unnamed Layer* 127) [Shuffle]...Unsqueeze_205]}.)
[07/01/2022-10:46:09] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )

zhaotieMeng avatar Jul 01 '22 02:07 zhaotieMeng

@zhaotieMeng did you fix it?

duanchengwen avatar Apr 26 '23 10:04 duanchengwen

I have the same error with TensorRT 8.2.3.0 + CUDA 11.2 on a P40, and the whole process succeeds with TensorRT 8.6.1.6.

acginf avatar Jun 07 '23 11:06 acginf

Same error with TensorRT 8.4.0.6 + CUDA 11.2 on an A10... going to try TensorRT 8.6.

santiweide avatar Mar 11 '24 09:03 santiweide

@TaylorHere, you said you solved it by increasing the workspace size. How much workspace is needed in your case?

gaosanyuan avatar Mar 27 '24 11:03 gaosanyuan