
Unsupported ONNX data type: UINT8 (2)

Open bnascimento opened this issue 4 years ago • 83 comments

Following the tutorial from the notebook https://github.com/onnx/tensorflow-onnx/blob/master/tutorials/ConvertingSSDMobilenetToONNX.ipynb, I am trying to take MobileNet v2 and v3 frozen models from TensorFlow (frozen_inference_graph.pb or saved_model.pb) and convert them to ONNX and then to TensorRT files. Under the NGC dockers 20.01-tf1-py3 and 19.05-py3 I am using both this project and tensorflow-onnx. I always get different issues; the furthest I got was under 20.01-tf1-py3 with both onnx-tensorrt and tensorflow-onnx on their master branches, installing the projects from source. I was able to create the .onnx file, but when I try to create the .trt file I get the following.

onnx2trt /media/bnascimento/project/frozen_inference_graph.onnx -o /media/bnascimento/project/frozen_inference_graph.trt
----------------------------------------------------------------
Input filename:   /media/bnascimento/project/frozen_inference_graph.onnx
ONNX IR version:  0.0.6
Opset version:    10
Producer name:    tf2onnx
Producer version: 1.6.0
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Parsing model
Unsupported ONNX data type: UINT8 (2)
ERROR: image_tensor:0:190 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)

I suspect this has to do with the input tensor for the image, but I don't know how to avoid this issue. Anyone with similar issues before?

Cheers, Bruno

bnascimento avatar Feb 21 '20 14:02 bnascimento

@bnascimento I get the same error when parsing a model. Did you manage to resolve your issue?

qraleq avatar Mar 15 '20 17:03 qraleq


Input filename:   model.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    tf2onnx
Producer version: 1.5.5
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------

Writing ONNX model (without weights) as text to my_engine.txt
Parsing model
Unsupported ONNX data type: UINT8 (2)
ERROR: image_tensor:0:190 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)

aif2017 avatar Mar 19 '20 11:03 aif2017


@bnascimento did you find a way to get around this?

aif2017 avatar Mar 20 '20 14:03 aif2017

Any update on this?

aif2017 avatar Mar 20 '20 14:03 aif2017

Any update on this?

aif2017 avatar Mar 28 '20 10:03 aif2017

TensorRT does not support the UINT8 data type, so this error means your model already uses uint8 somewhere (typically the input tensor). Check here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/FoundationalTypes/DataType.html
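
If you want to confirm which input is uint8, a quick check with the Python onnx package looks roughly like this (just a sketch; "model.onnx" is a placeholder path for your exported model):

import onnx
from onnx import TensorProto

# Print every graph input's name and element type; UINT8 is enum value 2,
# which is exactly what the TensorRT parser is rejecting.
model = onnx.load("model.onnx")
for inp in model.graph.input:
    elem_type = inp.type.tensor_type.elem_type
    print(inp.name, elem_type, TensorProto.DataType.Name(elem_type))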

chiehpower avatar Apr 01 '20 09:04 chiehpower

TensorRT does not support the UINT8 data type, so this error means your model already uses uint8 somewhere (typically the input tensor). Check here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/FoundationalTypes/DataType.html

Thanks, but that node is the image input, which is cast to float32 in the very next step!

ai-vip2020 avatar Apr 23 '20 02:04 ai-vip2020


Any update?

WhaSukGO avatar May 08 '20 03:05 WhaSukGO

same problem. Any update?

hfinger avatar May 26 '20 12:05 hfinger

Same problem as here https://forums.developer.nvidia.com/t/unsupported-onnx-data-type-uint8-2/75044/10

turowicz avatar Jun 05 '20 08:06 turowicz

Any solutions to this problem??

WARNING: ONNX model has a newer ir_version (0.0.5) than this parser was built against (0.0.3).
Unsupported ONNX data type: UINT8 (2)
ERROR: ModelImporter.cpp:54 In function importInput:
[8] Assertion failed: convert_dtype(onnx_tensor_type.elem_type(), &trt_dtype)
[05/29/2020-10:13:46] [E] Failed to parse onnx file
[05/29/2020-10:13:46] [E] Parsing model failed
[05/29/2020-10:13:46] [E] Engine could not be created
&&&& FAILED TensorRT.trtexec # ./trtexec --onnx=inception_standard.onnx

Ram-Godavarthi avatar Jun 29 '20 09:06 Ram-Godavarthi

Hey! Even I have the same problem. Any solutions?

Unsupported ONNX data type: UINT8 (2)
ERROR: batch:1:191 In function importInput:
[8] Assertion failed: convertDtype(onnxDtype.elem_type(), &trtDtype)
[06/29/2020-16:30:09] [E] Failed to parse onnx file
[06/29/2020-16:30:09] [E] Parsing model failed
[06/29/2020-16:30:09] [E] Engine creation failed
[06/29/2020-16:30:09] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=/home/xyz/Downloads/train_batch_shape.onnx --shapes=input_3:1x200x200x3

Guneetkaur03 avatar Jun 30 '20 05:06 Guneetkaur03

Unsupported ONNX data type: UINT8 (2). Has anyone solved this?

pranavk2050 avatar Aug 14 '20 12:08 pranavk2050

First, I have to say that I haven't had such a janky experience with software in years. Working with this ONNX and TensorRT ecosystem is a complete nightmare.

Second, I was able to solve the UINT8 problem by using the code from this NVIDIA Developers forum post: https://forums.developer.nvidia.com/t/problem-converting-onnx-model-to-tensorrt-engine-for-ssd-mobilenet-v2/139337/16

This fixes the original frozen_inference_graph.pb file, which then needs to be converted to ONNX and then to TensorRT.

douglasrizzo avatar Aug 31 '20 03:08 douglasrizzo

Here are the steps I took, though they ended up failing anyway.

Step 1: fix UINT8 error

Here is a script that generates a new frozen inference graph with float inputs from one with int inputs:

Suppose it's called fix_uint8.py. Its usage is: python fix_uint8.py frozen_inference_graph.pb fixed_inference_graph.pb

import tensorflow as tf
import graphsurgeon as gs
import sys

graph = gs.DynamicGraph(sys.argv[1])
image_tensor = graph.find_nodes_by_name('image_tensor')

print('Found Input: ', image_tensor)

cast_node = graph.find_nodes_by_name('Cast')[0] #Replace Cast with ToFloat if using tensorflow <1.15
print('Old field', cast_node.attr['SrcT'])

cast_node.attr['SrcT'].type=1 #Changing Expected type to float
print('New field', cast_node.attr['SrcT'])

input_node = gs.create_plugin_node(name='InputNode', op='Placeholder', shape=(-1, -1, -1, 3), dtype=tf.float32)
namespace_plugin_map = {'image_tensor': input_node}
graph.collapse_namespaces(namespace_plugin_map)
graph.write(sys.argv[2])

Step 2: generate ONNX file from fixed .pb file

Let's say I fixed a file and called it mobilenet_v2_0.35_128.pb. I then call tf2onnx on this file:

python -m tf2onnx.convert --input mobilenet_v2_0.35_128.pb --inputs InputNode:0 --output mobilenet_v2_0.35_128.onnx --opset 11 --outputs detection_boxes:0,detection_scores:0,detection_multiclass_scores:0,detection_classes:0,num_detections:0,raw_detection_boxes:0,raw_detection_scores:0

2020-08-31 05:32:04,426 - INFO - Using tensorflow=1.15.0, onnx=1.7.0, tf2onnx=1.6.3/d4abc8
2020-08-31 05:32:04,426 - INFO - Using opset <onnx, 11>
2020-08-31 05:32:10,228 - INFO - Optimizing ONNX model
2020-08-31 05:32:28,812 - INFO - After optimization: BatchNormalization -53 (60->7), Cast -34 (131->97), Const -578 (916->338), Gather +6 (29->35), Identity -129 (130->1), Less -2 (10->8), Mul -2 (37->35), Reshape -15 (45->30), Shape -8 (33->25), Slice -7 (56->49), Squeeze -22 (73->51), Transpose -272 (291->19), Unsqueeze -63 (102->39)
2020-08-31 05:32:28,896 - INFO -
2020-08-31 05:32:28,896 - INFO - Successfully converted TensorFlow model mobilenet_v2_0.35_128.pb to ONNX
2020-08-31 05:32:28,925 - INFO - ONNX model is saved at mobilenet_v2_0.35_128.onnx

Step 3: generate TensorRT "engine" from ONNX file

Lastly, I call onnx2trt:

onnx2trt mobilenet_v2_0.35_128.onnx -o mobilenet_v2_0.35_128_engine.trt
----------------------------------------------------------------
Input filename:   mobilenet_v2_0.35_128.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    tf2onnx
Producer version: 1.6.3
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
Parsing model
[2020-08-31 08:27:24 WARNING] [TRT]/home/user/Code/onnx-tensorrt/onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[2020-08-31 08:27:24 WARNING] [TRT]/home/user/Code/onnx-tensorrt/onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[2020-08-31 08:27:24 WARNING] [TRT]/home/user/Code/onnx-tensorrt/onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[2020-08-31 08:27:24   ERROR] INVALID_ARGUMENT: getPluginCreator could not find plugin NonMaxSuppression version 1
While parsing node number 306 [Loop -> "unused_loop_output___73"]:
ERROR: /home/user/Code/onnx-tensorrt/builtin_op_importers.cpp:3713 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

I've trained my network using TF 1.15, but I get this error even when I execute these steps with either TF 2.3 or 1.15.

douglasrizzo avatar Aug 31 '20 08:08 douglasrizzo

@douglasrizzo which model are you training?

turowicz avatar Sep 01 '20 11:09 turowicz

@turowicz I am training MobileNet v2 and v3 models from the TensorFlow Object Detection API. I get pre-trained models from here and train them on a custom dataset (for object detection, not classification).

The "non-max suppression" operation that seems to be giving trouble to TensorRT is specific to object detection tasks. It basically consists of removing multiple bounding boxes that may be predicted on top of the same object, returning only the one with highest confidence.

douglasrizzo avatar Sep 01 '20 14:09 douglasrizzo

@douglasrizzo did you find any solution? Can you please share? Thanks.

cognitiveRobot avatar Sep 26 '20 05:09 cognitiveRobot

@cognitiveRobot I ditched TensorRT and the Jetson and did inference on an Intel NUC, directly on the CPU.

douglasrizzo avatar Sep 26 '20 06:09 douglasrizzo

@douglasrizzo thanks a lot. That could be a solution for us too.

How many FPS do you get for your models on the Intel NUC?

What is the size of your input images?

Can you please share these numbers? They will really help us make our final decision.

cognitiveRobot avatar Sep 27 '20 00:09 cognitiveRobot

@cognitiveRobot oh boy oh boy, do I have answers for you. I trained all the MobileNetV2 and V3 models from this page with a width multiplier of 1 or less to detect a single class (soccer balls). I then collected the mean inference time for a single frame over a 30-second video, both on a Tesla V100 GPU and on an Intel i5-4210U. You can see the results below.

The i5 is between 1.3 and 1.5 times slower than the V100, but be aware that this depends a lot on the implementation. The TF Object Detection API is pretty fast for inference on CPUs. On the other hand, the official YOLOv4 has an inference time of 50 ms on the V100 and a whopping 5 seconds on our feeble CPU.

[Image: table of mean inference times for each MobileNet model on the Tesla V100 and the Intel i5]

As for the inference time when processing images of different sizes:

  • 1920x1080: ~85 ms
  • 1280x720: ~70 ms
  • 640x480: ~59 ms
  • 480x360: ~56 ms

Just bear in mind that the MobileNets already scale down images before processing them, so it may be a good idea to configure your camera/input feed to a low resolution too; it should matter little to the network.

douglasrizzo avatar Sep 27 '20 01:09 douglasrizzo

@douglasrizzo thanks a lot again. It will be really helpful.

cognitiveRobot avatar Sep 27 '20 03:09 cognitiveRobot


INVALID_ARGUMENT: getPluginCreator could not find plugin NonMaxSuppression version 1

Has anyone else stumbled into this issue? We also need a solution.

bnascimento avatar Oct 27 '20 10:10 bnascimento

@bnascimento non-max suppression is an "operation" that seems to be implemented in TensorFlow and ONNX, but not in TensorRT, so converting any model that uses non-max suppression in its architecture to TensorRT is going to fail.

I believe the solution would be to implement it in TensorRT...

douglasrizzo avatar Oct 27 '20 19:10 douglasrizzo

Hi @douglasrizzo, I've been looking into this issue and it seems that TensorRT has this operation among its plugins. See https://github.com/NVIDIA/TensorRT/tree/master/plugin/batchedNMSPlugin or https://github.com/NVIDIA/TensorRT/tree/master/plugin/nmsPlugin. The reason might be that these are very specific operations, mostly used in object detection, for example.

There are other people with a similar issue who have laid out different approaches, but so far I've been unsuccessful. See https://github.com/NVIDIA/TensorRT/issues/795

bnascimento avatar Oct 27 '20 20:10 bnascimento

@bnascimento Try splitting the TensorFlow graph right before NMS. You will then get two graphs: 'network_forward' and 'postprocess'. Convert just the 'network_forward' part to TensorRT.
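
For reference, a rough sketch of that splitting idea using the TF 1.x graph_util API is below. The destination node names raw_detection_boxes and raw_detection_scores are assumptions taken from the tf2onnx command earlier in this thread; use whichever tensors actually feed NMS in your graph:

import tensorflow as tf

# Load the full frozen graph
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Keep only the sub-graph that ends at the pre-NMS tensors ('network_forward');
# everything from NMS onward stays in TensorFlow as the 'postprocess' part.
forward_def = tf.compat.v1.graph_util.extract_sub_graph(
    graph_def, ["raw_detection_boxes", "raw_detection_scores"])

with tf.io.gfile.GFile("network_forward.pb", "wb") as f:
    f.write(forward_def.SerializeToString())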

This link might be helpful: realtime_object_detection

qin2294096 avatar Oct 30 '20 03:10 qin2294096

Hey guys, I had this same problem too, and maybe this script can help, as it helped me:

import onnx

def change_input_datatype(model, typeNdx):
    # values for typeNdx:
    # 1 = float32, 2 = uint8, 3 = int8, 4 = uint16, 5 = int16, 6 = int32, 7 = int64
    for input in model.graph.input:
        input.type.tensor_type.elem_type = typeNdx

def change_input_batchsize(model, batchSize):
    for input in model.graph.input:
        dim1 = input.type.tensor_type.shape.dim[0]
        dim1.dim_value = batchSize
        # print("input: ", input)  # uncomment to see input layer details

def change_output_batchsize(model, batchSize):
    for output in model.graph.output:
        dim1 = output.type.tensor_type.shape.dim[0]
        dim1.dim_value = batchSize
        # print("output: ", output)  # uncomment to see output layer details

onnx_model = onnx.load("model.onnx")  # placeholder: path to your ONNX model

change_input_datatype(onnx_model, 1)
change_input_batchsize(onnx_model, 1)
change_output_batchsize(onnx_model, 1)

onnx.save(onnx_model, "model_updated.onnx")  # placeholder: path for the modified model

Here we can change the data type of the input tensor. Resource: https://forums.developer.nvidia.com/t/unsupported-onnx-data-type-uint8-2/75044/16?u=karanprojectx

absolution747 avatar Nov 04 '20 05:11 absolution747

Very similar problem with the CumSum operator on a PyTorch RoBERTa implementation, exported at ONNX opset 11:

import onnx
import onnxruntime
import onnx_tensorrt.backend as backend
model = onnx.load('/workspace/models/onnx-my-32.model')
engine = backend.prepare(model)
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1115620964
[libprotobuf WARNING /workspace/TensorRT/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING /workspace/TensorRT/build/third_party.protobuf/src/third_party.protobuf/src/google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1115620964
[TensorRT] WARNING: /workspace/TensorRT/parsers/onnx/onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting
to cast down to INT32.
[TensorRT] ERROR: INVALID_ARGUMENT: getPluginCreator could not find plugin CumSum version 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspace/TensorRT/parsers/onnx/onnx_tensorrt/backend.py", line 254, in prepare
    return TensorRTBackendRep(model, device, **kwargs)
  File "/workspace/TensorRT/parsers/onnx/onnx_tensorrt/backend.py", line 92, in __init__
    raise RuntimeError(msg)

The error: [TensorRT] ERROR: INVALID_ARGUMENT: getPluginCreator could not find plugin CumSum version 1

Reassigning elem_type like @absolution747 pointed out does not solve this, only removes the INT64 warning.

mihajenko avatar Nov 08 '20 22:11 mihajenko

I had the same error with my code. I found a tool that can solve the problem; here is how I did it.

  1. Install ONNX Graphsurgeon API
$ sudo apt-get install python3-pip libprotobuf-dev protobuf-compiler
$ git clone https://github.com/NVIDIA/TensorRT.git
$ cd TensorRT/tools/onnx-graphsurgeon/
$ make install
  2. Modify your model
import onnx_graphsurgeon as gs
import onnx
import numpy as np

graph = gs.import_onnx(onnx.load("model.onnx"))
for inp in graph.inputs:
    inp.dtype = np.float32

onnx.save(gs.export_onnx(graph), "updated_model.onnx")

CMangoDH avatar Nov 19 '20 09:11 CMangoDH