Jetson 4.6.1: opset 17 not supported for onnxruntime-gpu 1.11.0

Open TomasBooneHogent opened this issue 1 year ago • 14 comments

Search before asking

  • [X] I have searched the Inference issues and found no similar bug report.

Bug

image image

Environment

  • NVIDIA Jetson-AGX
  • L4T 32.6.1 [ JetPack 4.6 ]
  • Ubuntu 18.04.5 LTS
  • Kernel Version: 4.9.253-tegra
  • CUDA 10.2.300
  • CUDA Architecture: 7.2
  • OpenCV version: 4.1.1
  • OpenCV Cuda: NO
  • CUDNN: 8.2.1.32
  • TensorRT: 8.2.1.9
  • Vision Works: 1.6.0.501
  • VPI: 1.2.3
  • Vulkan: 1.2.70

Minimal Reproducible Example

No response

Additional

JetPack 4.6.1 supports onnxruntime only up to 1.11.0, and that onnxruntime version supports only up to opset 16. Roboflow models are opset 17 models, so they need to be converted to opset 16 models for Jetson 4.6.1.
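
To confirm which opset a downloaded model declares, the check can be sketched as below. This is a minimal sketch, not part of the inference package: `default_opset` and `runs_on_ort_1_11` are hypothetical helper names, the opset 16 limit is the one stated in this report, and `weights.onnx` is a placeholder path. The helper only relies on the `opset_import` field that an `onnx.ModelProto` exposes, so a stand-in object is used here to keep the example runnable without the `onnx` package.

```python
from types import SimpleNamespace

# Limit taken from this report: onnxruntime 1.11.0 handles up to opset 16.
MAX_OPSET_ORT_1_11 = 16


def default_opset(model) -> int:
    """Return the opset version of the default ONNX operator domain.

    `model` is anything that exposes `opset_import` the way an
    onnx.ModelProto does: an iterable of entries with `.domain` and `.version`.
    """
    for entry in model.opset_import:
        # The default operator set uses the empty domain (or "ai.onnx").
        if entry.domain in ("", "ai.onnx"):
            return entry.version
    raise ValueError("model declares no default-domain opset")


def runs_on_ort_1_11(model) -> bool:
    """True if the model's default opset is within the opset 16 limit."""
    return default_opset(model) <= MAX_OPSET_ORT_1_11


# With the `onnx` package installed, a real file could be inspected like this
# ("weights.onnx" is a placeholder path):
#   import onnx
#   print(default_opset(onnx.load("weights.onnx")))

# Stand-in object so the sketch runs without `onnx` installed.
fake_model = SimpleNamespace(opset_import=[SimpleNamespace(domain="", version=17)])
print(runs_on_ort_1_11(fake_model))
```

A model that reports opset 17 here is one that would hit the error described above on onnxruntime-gpu 1.11.0.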

Are you willing to submit a PR?

  • [ ] Yes I'd like to help by submitting a PR!

TomasBooneHogent avatar May 06 '24 09:05 TomasBooneHogent

hi there,

Thanks for reporting the problem. May I ask when the model was trained? (I assume you trained the model on the Roboflow platform.) I am asking because this problem has been reported to us before, and we reverted the change that made models opset 17, so they are opset 16 again.

PawelPeczek-Roboflow avatar May 07 '24 07:05 PawelPeczek-Roboflow

When I update the inference package manually the opset issue is solved indeed, but the CUDAExecutionProvider cannot be found and therefore defaults to the CPUExecutionProvider. Are you sure the latest inference code is compatible with CUDA 10.2.3?
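
A quick way to see whether the installed onnxruntime build exposes the CUDA provider at all is sketched below. This is a minimal sketch, assuming only that `onnxruntime` may be installed; `usable_providers` and `gpu_enabled` are hypothetical helper names, while `get_available_providers()` is the onnxruntime call that lists the providers compiled into the build.

```python
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def usable_providers(available, preferred=PREFERRED):
    """Keep the preferred providers that this build offers, in preference order."""
    return [p for p in preferred if p in available]


def gpu_enabled(available) -> bool:
    """True if the CUDA execution provider is offered by this build."""
    return "CUDAExecutionProvider" in available


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        available = ort.get_available_providers()
        print("available:", available)
        print("will use:", usable_providers(available))
        if not gpu_enabled(available):
            print("CUDA provider missing: sessions would run on the CPU provider")
```

If `CUDAExecutionProvider` is absent from the list, the installed wheel was most likely a CPU-only build, which would explain the fallback regardless of the model's opset.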

TomasBooneHogent avatar May 07 '24 09:05 TomasBooneHogent

yolov8s (OD) Generated on Feb 5, 2024 = no issue

yolov8s (OD) Generated on Feb 22, 2024 = no issue

yolov8s (OD) Generated on Mar 5, 2024 = issue

yolov8s (OD) Generated on Mar 20, 2024 = issue

yolov8s (OD) Generated on Mar 25, 2024 = issue

Roboflow 3.0 (OD) Generated on Mar 13, 2024 = no issue

Roboflow 3.0 (OD) Generated on Mar 18, 2024 = no issue

Roboflow 3.0 (OD) Generated on Mar 18, 2024 = no issue

Roboflow 3.0 (OD) Generated on Apr 19, 2024 = issue

yolov8s (IS) Generated on Feb 26, 2024 = no issue

yolov8l (IS) Generated on Mar 5, 2024 = issue

yolov8s (IS) Generated on Mar 6, 2024 = issue

yolov8s (IS) Generated on Mar 14, 2024 = issue

yolov8s (IS) Generated on Mar 28, 2024 = issue

Roboflow 3.0 (IS) Generated on Mar 14, 2024 = no issue

Roboflow 3.0 (IS) Generated on Apr 17, 2024 = issue

TomasBooneHogent avatar May 07 '24 09:05 TomasBooneHogent

OD = Object Detection, IS = Instance Segmentation

TomasBooneHogent avatar May 07 '24 09:05 TomasBooneHogent

So only newly trained models will be compatible again?

TomasBooneHogent avatar May 07 '24 09:05 TomasBooneHogent

Let's connect through e-mail ([email protected]). What you report is indeed worrying. I would like to take a look at the model artefacts to verify what's going on, but I would need to know internal details about your project on the platform to figure out the issue.

PawelPeczek-Roboflow avatar May 07 '24 10:05 PawelPeczek-Roboflow

I already reported and showed the issue to Jack Gallo; he knows the details.

TomasBooneHogent avatar May 07 '24 10:05 TomasBooneHogent

It would basically be one line of code: call Yolo.export(format="onnx", opset=16) if the onnxruntime version is < 1.12.0.
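
The version check suggested above could look roughly like the sketch below. It is a minimal sketch, not Roboflow's export code: `version_tuple` and `export_opset` are hypothetical helpers, the default of opset 17 is taken from this thread, and the rule covers only the "< 1.12.0 means opset 16" case described here (older runtimes may need an even lower opset). The commented Ultralytics call assumes the model is exported with the `ultralytics` package, and `best.pt` is a placeholder path.

```python
import re


def version_tuple(version: str):
    """Parse the leading 'major.minor' of a version string into two ints."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def export_opset(ort_version: str, default_opset: int = 17) -> int:
    """Use opset 16 for onnxruntime older than 1.12, otherwise the default."""
    if version_tuple(ort_version) < (1, 12):
        return 16
    return default_opset


print(export_opset("1.11.0"))  # 16
print(export_opset("1.12.0"))  # 17

# Assumed use with Ultralytics ("best.pt" is a placeholder path):
#   from ultralytics import YOLO
#   YOLO("best.pt").export(format="onnx", opset=export_opset("1.11.0"))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would order "1.9" after "1.11".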

TomasBooneHogent avatar May 07 '24 10:05 TomasBooneHogent

OK, I will ask Jack. And yes, we also thought so in terms of a solution.

PawelPeczek-Roboflow avatar May 07 '24 10:05 PawelPeczek-Roboflow

@PawelPeczek-Roboflow

I have the same issue using a classification model in a two-stage detection setup. The first object detection model works fine, but the second classification model is opset 17 and thus again incompatible. I also fixed a bug where the _sqlite module was not found because you forgot to install it in the jetson4.6.1 Dockerfile.

TomasBooneHogent avatar Oct 21 '24 13:10 TomasBooneHogent

Hi, we will likely be able to fix this as a stopgap but please note that Jetpack 4 is EOL and no longer supported by NVIDIA. Please consider upgrading. We will likely drop support soon as well.

yeldarby avatar Oct 21 '24 13:10 yeldarby

@yeldarby thank you! Once we get a classification model working, we should normally no longer need updates.

TomasBooneHogent avatar Oct 21 '24 13:10 TomasBooneHogent

@yeldarby this fixed the "module _sqlite not found" error:

Adding libsqlite3-dev as an install step before installing Python in the jetson4.6.1 Dockerfile.
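
Whether a given Python build picked up SQLite support can be checked from inside the container with a few lines. This is a minimal sketch, assuming the missing module is CPython's `_sqlite3` C extension (the one that backs the standard `sqlite3` module); `has_sqlite_extension` is a hypothetical helper name.

```python
import importlib.util


def has_sqlite_extension() -> bool:
    """Return True if this interpreter was built with the _sqlite3 C extension."""
    return importlib.util.find_spec("_sqlite3") is not None


if __name__ == "__main__":
    print("sqlite3 available:", has_sqlite_extension())
```

When Python is compiled from source, the extension is only built if the SQLite development headers are present at build time, which is why the package has to be installed before the Python build step.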

TomasBooneHogent avatar Oct 21 '24 13:10 TomasBooneHogent

@yeldarby any updates? We need it in production.

TomasBooneHogent avatar Oct 23 '24 09:10 TomasBooneHogent