DeepStream-Yolo

Yolov5s model int8 calibration core dump issue

Open jessie-zhao opened this issue 2 years ago • 14 comments

Hi all,

I am running DeepStream 6.1 on an A10 GPU with Ubuntu 20.04. When I run the yolov5s model with INT8 calibration, I get the error below. Can someone help with this?

#deepstream-app -c ./deepstream_app_config.txt

...
Total number of YOLO layers: 272

Building YOLO network complete
Building the TensorRT Engine

NOTE: letter_box is set in cfg file, make sure to set maintain-aspect-ratio=1 in config_infer file to get better accuracy

File does not exist: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calib.table
WARNING: [TRT]: TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
WARNING: [TRT]: TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
WARNING: [TRT]: TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
Load image: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calibration/000000000139.jpg
Progress: 0.1%
CUDA failure: no error in file yoloPlugins.cpp at line 252
Aborted (core dumped)

jessie-zhao avatar Aug 09 '22 11:08 jessie-zhao

I can't reproduce this issue. Can you send more details?

marcoslucianops avatar Aug 12 '22 20:08 marcoslucianops

Thanks a lot for your help. Which kinds of details do you need? I can run DeepStream 6.1 with YOLOv5s (v6.1) at FP32/FP16 precision successfully, but it fails with INT8. I confirmed that I modified the config_infer file and added the calibration dataset according to the guide.

Below is the INT8 config_infer file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=yolov5s.cfg
model-file=yolov5s.wts
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
labelfile-path=labels.txt
batch-size=1
network-mode=1
num-detected-classes=80
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=4
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
pre-cluster-threshold=0

The DeepStream version is 6.1. The NVIDIA device is an A10.

jessie-zhao avatar Aug 12 '22 22:08 jessie-zhao
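For readers checking a similar setup, here is a minimal Python sketch that sanity-checks the INT8-related keys of a config_infer file like the one posted above. The function name `check_int8_config` is hypothetical and not part of DeepStream. The network-mode mapping (0 = FP32, 1 = INT8, 2 = FP16) follows the Gst-nvinfer documentation. The sketch only inspects the text of the file; it does not explain the crash reported in this thread.

```python
import configparser

# Meaning of the nvinfer "network-mode" values (per the Gst-nvinfer docs).
NETWORK_MODES = {0: "FP32", 1: "INT8", 2: "FP16"}


def check_int8_config(config_text):
    """Return a list of problems found in the [property] group.

    Hypothetical helper for illustration; an empty list means the two
    INT8-related keys look consistent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    prop = parser["property"]

    problems = []
    mode = prop.getint("network-mode", fallback=0)
    if NETWORK_MODES.get(mode) != "INT8":
        name = NETWORK_MODES.get(mode, "unknown")
        problems.append(f"network-mode={mode} selects {name}, not INT8")
    if not prop.get("int8-calib-file", fallback=""):
        problems.append("int8-calib-file is not set")
    return problems


if __name__ == "__main__":
    sample = "[property]\nnetwork-mode=1\nint8-calib-file=calib.table\n"
    print(check_int8_config(sample))
```

Applied to the config in this thread, both checks pass: network-mode=1 and int8-calib-file=calib.table are set, which matches the log showing that calibration starts before the abort.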

I did a reinstall and get the same error, shown below:

deepstream-app -c ./deepstream_app_config.txt

(deepstream-app:6565): GLib-GObject-WARNING **: 07:46:40.227: value "TRUE" of type 'gboolean' is invalid or out of range for property 'sync' of type 'gboolean'

(deepstream-app:6565): GLib-GObject-WARNING **: 07:46:40.227: value "TRUE" of type 'gboolean' is invalid or out of range for property 'qos' of type 'gboolean'
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1482 Deserialize engine failed because file path: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine open error
0:00:01.268603705 6565 0x562275a22cf0 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine failed
0:00:01.286617163 6565 0x562275a22cf0 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine failed, try rebuild
0:00:01.286643910 6565 0x562275a22cf0 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation

Loading pre-trained weights
Loading weights of yolov5s complete
Total weights read: 7254397
Building YOLO network

    Layer                         Input Shape         Output Shape        WeightPtr

(0)   conv_silu                   [3, 640, 640]       [32, 320, 320]      3584
(1)   conv_silu                   [32, 320, 320]      [64, 160, 160]      22272
(2)   conv_silu                   [64, 160, 160]      [32, 160, 160]      24448
(3)   route: 1                    -                   [64, 160, 160]      -
(4)   conv_silu                   [64, 160, 160]      [32, 160, 160]      26624
(5)   conv_silu                   [32, 160, 160]      [32, 160, 160]      27776
(6)   conv_silu                   [32, 160, 160]      [32, 160, 160]      37120
(7)   shortcut_add_linear: 4      [32, 160, 160]      [32, 160, 160]      -
(8)   route: 7, 2                 -                   [64, 160, 160]      -
(9)   conv_silu                   [64, 160, 160]      [64, 160, 160]      41472
(10)  conv_silu                   [64, 160, 160]      [128, 80, 80]       115712
(11)  conv_silu                   [128, 80, 80]       [64, 80, 80]        124160
(12)  route: 10                   -                   [128, 80, 80]       -
(13)  conv_silu                   [128, 80, 80]       [64, 80, 80]        132608
(14)  conv_silu                   [64, 80, 80]        [64, 80, 80]        136960
(15)  conv_silu                   [64, 80, 80]        [64, 80, 80]        174080
(16)  shortcut_add_linear: 13     [64, 80, 80]        [64, 80, 80]        -
(17)  conv_silu                   [64, 80, 80]        [64, 80, 80]        178432
(18)  conv_silu                   [64, 80, 80]        [64, 80, 80]        215552
(19)  shortcut_add_linear: 16     [64, 80, 80]        [64, 80, 80]        -
(20)  route: 19, 11               -                   [128, 80, 80]       -
(21)  conv_silu                   [128, 80, 80]       [128, 80, 80]       232448
(22)  conv_silu                   [128, 80, 80]       [256, 40, 40]       528384
(23)  conv_silu                   [256, 40, 40]       [128, 40, 40]       561664
(24)  route: 22                   -                   [256, 40, 40]       -
(25)  conv_silu                   [256, 40, 40]       [128, 40, 40]       594944
(26)  conv_silu                   [128, 40, 40]       [128, 40, 40]       611840
(27)  conv_silu                   [128, 40, 40]       [128, 40, 40]       759808
(28)  shortcut_add_linear: 25     [128, 40, 40]       [128, 40, 40]       -
(29)  conv_silu                   [128, 40, 40]       [128, 40, 40]       776704
(30)  conv_silu                   [128, 40, 40]       [128, 40, 40]       924672
(31)  shortcut_add_linear: 28     [128, 40, 40]       [128, 40, 40]       -
(32)  conv_silu                   [128, 40, 40]       [128, 40, 40]       941568
(33)  conv_silu                   [128, 40, 40]       [128, 40, 40]       1089536
(34)  shortcut_add_linear: 31     [128, 40, 40]       [128, 40, 40]       -
(35)  route: 34, 23               -                   [256, 40, 40]       -
(36)  conv_silu                   [256, 40, 40]       [256, 40, 40]       1156096
(37)  conv_silu                   [256, 40, 40]       [512, 20, 20]       2337792
(38)  conv_silu                   [512, 20, 20]       [256, 20, 20]       2469888
(39)  route: 37                   -                   [512, 20, 20]       -
(40)  conv_silu                   [512, 20, 20]       [256, 20, 20]       2601984
(41)  conv_silu                   [256, 20, 20]       [256, 20, 20]       2668544
(42)  conv_silu                   [256, 20, 20]       [256, 20, 20]       3259392
(43)  shortcut_add_linear: 40     [256, 20, 20]       [256, 20, 20]       -
(44)  route: 43, 38               -                   [512, 20, 20]       -
(45)  conv_silu                   [512, 20, 20]       [512, 20, 20]       3523584
(46)  conv_silu                   [512, 20, 20]       [256, 20, 20]       3655680
(47)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(48)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(49)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(50)  route: 46, 47, 48, 49       -                   [1024, 20, 20]      -
(51)  conv_silu                   [1024, 20, 20]      [512, 20, 20]       4182016
(52)  conv_silu                   [512, 20, 20]       [256, 20, 20]       4314112
(53)  upsample                    [256, 20, 20]       [256, 40, 40]       -
(54)  route: 53, 36               -                   [512, 40, 40]       -
(55)  conv_silu                   [512, 40, 40]       [128, 40, 40]       4380160
(56)  route: 54                   -                   [512, 40, 40]       -
(57)  conv_silu                   [512, 40, 40]       [128, 40, 40]       4446208
(58)  conv_silu                   [128, 40, 40]       [128, 40, 40]       4463104
(59)  conv_silu                   [128, 40, 40]       [128, 40, 40]       4611072
(60)  route: 59, 55               -                   [256, 40, 40]       -
(61)  conv_silu                   [256, 40, 40]       [256, 40, 40]       4677632
(62)  conv_silu                   [256, 40, 40]       [128, 40, 40]       4710912
(63)  upsample                    [128, 40, 40]       [128, 80, 80]       -
(64)  route: 63, 21               -                   [256, 80, 80]       -
(65)  conv_silu                   [256, 80, 80]       [64, 80, 80]        4727552
(66)  route: 64                   -                   [256, 80, 80]       -
(67)  conv_silu                   [256, 80, 80]       [64, 80, 80]        4744192
(68)  conv_silu                   [64, 80, 80]        [64, 80, 80]        4748544
(69)  conv_silu                   [64, 80, 80]        [64, 80, 80]        4785664
(70)  route: 69, 65               -                   [128, 80, 80]       -
(71)  conv_silu                   [128, 80, 80]       [128, 80, 80]       4802560
(72)  conv_silu                   [128, 80, 80]       [128, 40, 40]       4950528
(73)  route: 72, 62               -                   [256, 40, 40]       -
(74)  conv_silu                   [256, 40, 40]       [128, 40, 40]       4983808
(75)  route: 73                   -                   [256, 40, 40]       -
(76)  conv_silu                   [256, 40, 40]       [128, 40, 40]       5017088
(77)  conv_silu                   [128, 40, 40]       [128, 40, 40]       5033984
(78)  conv_silu                   [128, 40, 40]       [128, 40, 40]       5181952
(79)  route: 78, 74               -                   [256, 40, 40]       -
(80)  conv_silu                   [256, 40, 40]       [256, 40, 40]       5248512
(81)  conv_silu                   [256, 40, 40]       [256, 20, 20]       5839360
(82)  route: 81, 52               -                   [512, 20, 20]       -
(83)  conv_silu                   [512, 20, 20]       [256, 20, 20]       5971456
(84)  route: 82                   -                   [512, 20, 20]       -
(85)  conv_silu                   [512, 20, 20]       [256, 20, 20]       6103552
(86)  conv_silu                   [256, 20, 20]       [256, 20, 20]       6170112
(87)  conv_silu                   [256, 20, 20]       [256, 20, 20]       6760960
(88)  route: 87, 83               -                   [512, 20, 20]       -
(89)  conv_silu                   [512, 20, 20]       [512, 20, 20]       7025152
(90)  route: 71                   -                   [128, 80, 80]       -
(91)  conv_logistic               [128, 80, 80]       [255, 80, 80]       7058047
(92)  yolo                        [255, 80, 80]       -                   -
(93)  route: 80                   -                   [256, 40, 40]       -
(94)  conv_logistic               [256, 40, 40]       [255, 40, 40]       7123582
(95)  yolo                        [255, 40, 40]       -                   -
(96)  route: 89                   -                   [512, 20, 20]       -
(97)  conv_logistic               [512, 20, 20]       [255, 20, 20]       7254397
(98)  yolo                        [255, 20, 20]       -                   -
      batched_nms                 -                   -                   -

Output YOLO blob names: yolo_93 yolo_96 yolo_99

Total number of YOLO layers: 272

Building YOLO network complete
Building the TensorRT Engine

NOTE: letter_box is set in cfg file, make sure to set maintain-aspect-ratio=1 in config_infer file to get better accuracy

File does not exist: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calib.table
Load image: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calibration/000000000724.jpg
Progress: 0.1%
CUDA failure: no error in file yoloPlugins.cpp at line 252
Aborted (core dumped)

jessie-zhao avatar Aug 13 '22 00:08 jessie-zhao

I did some updates in the repo. Can you test with the new files?

marcoslucianops avatar Aug 15 '22 06:08 marcoslucianops

Thanks, I updated the repo as you suggested, but I still have the issue with INT8:

deepstream-app -c ./deepstream_app_config.txt

(deepstream-app:18840): GLib-GObject-WARNING **: 11:55:13.074: value "TRUE" of type 'gboolean' is invalid or out of range for property 'sync' of type 'gboolean'

(deepstream-app:18840): GLib-GObject-WARNING **: 11:55:13.074: value "TRUE" of type 'gboolean' is invalid or out of range for property 'qos' of type 'gboolean'
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1482 Deserialize engine failed because file path: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine open error
0:00:01.251161688 18840 0x5642666b12f0 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine failed
0:00:01.268393938 18840 0x5642666b12f0 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/model_b1_gpu0_int8.engine failed, try rebuild
0:00:01.268442145 18840 0x5642666b12f0 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation

Loading pre-trained weights
Loading weights of yolov5s complete
Total weights read: 7254397
Building YOLO network

    Layer                         Input Shape         Output Shape        WeightPtr

(0)   conv_silu                   [3, 640, 640]       [32, 320, 320]      3584
(1)   conv_silu                   [32, 320, 320]      [64, 160, 160]      22272
(2)   conv_silu                   [64, 160, 160]      [32, 160, 160]      24448
(3)   route: 1                    -                   [64, 160, 160]      -
(4)   conv_silu                   [64, 160, 160]      [32, 160, 160]      26624
(5)   conv_silu                   [32, 160, 160]      [32, 160, 160]      27776
(6)   conv_silu                   [32, 160, 160]      [32, 160, 160]      37120
(7)   shortcut_add_linear: 4      [32, 160, 160]      [32, 160, 160]      -
(8)   route: 7, 2                 -                   [64, 160, 160]      -
(9)   conv_silu                   [64, 160, 160]      [64, 160, 160]      41472
(10)  conv_silu                   [64, 160, 160]      [128, 80, 80]       115712
(11)  conv_silu                   [128, 80, 80]       [64, 80, 80]        124160
(12)  route: 10                   -                   [128, 80, 80]       -
(13)  conv_silu                   [128, 80, 80]       [64, 80, 80]        132608
(14)  conv_silu                   [64, 80, 80]        [64, 80, 80]        136960
(15)  conv_silu                   [64, 80, 80]        [64, 80, 80]        174080
(16)  shortcut_add_linear: 13     [64, 80, 80]        [64, 80, 80]        -
(17)  conv_silu                   [64, 80, 80]        [64, 80, 80]        178432
(18)  conv_silu                   [64, 80, 80]        [64, 80, 80]        215552
(19)  shortcut_add_linear: 16     [64, 80, 80]        [64, 80, 80]        -
(20)  route: 19, 11               -                   [128, 80, 80]       -
(21)  conv_silu                   [128, 80, 80]       [128, 80, 80]       232448
(22)  conv_silu                   [128, 80, 80]       [256, 40, 40]       528384
(23)  conv_silu                   [256, 40, 40]       [128, 40, 40]       561664
(24)  route: 22                   -                   [256, 40, 40]       -
(25)  conv_silu                   [256, 40, 40]       [128, 40, 40]       594944
(26)  conv_silu                   [128, 40, 40]       [128, 40, 40]       611840
(27)  conv_silu                   [128, 40, 40]       [128, 40, 40]       759808
(28)  shortcut_add_linear: 25     [128, 40, 40]       [128, 40, 40]       -
(29)  conv_silu                   [128, 40, 40]       [128, 40, 40]       776704
(30)  conv_silu                   [128, 40, 40]       [128, 40, 40]       924672
(31)  shortcut_add_linear: 28     [128, 40, 40]       [128, 40, 40]       -
(32)  conv_silu                   [128, 40, 40]       [128, 40, 40]       941568
(33)  conv_silu                   [128, 40, 40]       [128, 40, 40]       1089536
(34)  shortcut_add_linear: 31     [128, 40, 40]       [128, 40, 40]       -
(35)  route: 34, 23               -                   [256, 40, 40]       -
(36)  conv_silu                   [256, 40, 40]       [256, 40, 40]       1156096
(37)  conv_silu                   [256, 40, 40]       [512, 20, 20]       2337792
(38)  conv_silu                   [512, 20, 20]       [256, 20, 20]       2469888
(39)  route: 37                   -                   [512, 20, 20]       -
(40)  conv_silu                   [512, 20, 20]       [256, 20, 20]       2601984
(41)  conv_silu                   [256, 20, 20]       [256, 20, 20]       2668544
(42)  conv_silu                   [256, 20, 20]       [256, 20, 20]       3259392
(43)  shortcut_add_linear: 40     [256, 20, 20]       [256, 20, 20]       -
(44)  route: 43, 38               -                   [512, 20, 20]       -
(45)  conv_silu                   [512, 20, 20]       [512, 20, 20]       3523584
(46)  conv_silu                   [512, 20, 20]       [256, 20, 20]       3655680
(47)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(48)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(49)  maxpool                     [256, 20, 20]       [256, 20, 20]       -
(50)  route: 46, 47, 48, 49       -                   [1024, 20, 20]      -
(51)  conv_silu                   [1024, 20, 20]      [512, 20, 20]       4182016
(52)  conv_silu                   [512, 20, 20]       [256, 20, 20]       4314112
(53)  upsample                    [256, 20, 20]       [256, 40, 40]       -
(54)  route: 53, 36               -                   [512, 40, 40]       -
(55)  conv_silu                   [512, 40, 40]       [128, 40, 40]       4380160
(56)  route: 54                   -                   [512, 40, 40]       -
(57)  conv_silu                   [512, 40, 40]       [128, 40, 40]       4446208
(58)  conv_silu                   [128, 40, 40]       [128, 40, 40]       4463104
(59)  conv_silu                   [128, 40, 40]       [128, 40, 40]       4611072
(60)  route: 59, 55               -                   [256, 40, 40]       -
(61)  conv_silu                   [256, 40, 40]       [256, 40, 40]       4677632
(62)  conv_silu                   [256, 40, 40]       [128, 40, 40]       4710912
(63)  upsample                    [128, 40, 40]       [128, 80, 80]       -
(64)  route: 63, 21               -                   [256, 80, 80]       -
(65)  conv_silu                   [256, 80, 80]       [64, 80, 80]        4727552
(66)  route: 64                   -                   [256, 80, 80]       -
(67)  conv_silu                   [256, 80, 80]       [64, 80, 80]        4744192
(68)  conv_silu                   [64, 80, 80]        [64, 80, 80]        4748544
(69)  conv_silu                   [64, 80, 80]        [64, 80, 80]        4785664
(70)  route: 69, 65               -                   [128, 80, 80]       -
(71)  conv_silu                   [128, 80, 80]       [128, 80, 80]       4802560
(72)  conv_silu                   [128, 80, 80]       [128, 40, 40]       4950528
(73)  route: 72, 62               -                   [256, 40, 40]       -
(74)  conv_silu                   [256, 40, 40]       [128, 40, 40]       4983808
(75)  route: 73                   -                   [256, 40, 40]       -
(76)  conv_silu                   [256, 40, 40]       [128, 40, 40]       5017088
(77)  conv_silu                   [128, 40, 40]       [128, 40, 40]       5033984
(78)  conv_silu                   [128, 40, 40]       [128, 40, 40]       5181952
(79)  route: 78, 74               -                   [256, 40, 40]       -
(80)  conv_silu                   [256, 40, 40]       [256, 40, 40]       5248512
(81)  conv_silu                   [256, 40, 40]       [256, 20, 20]       5839360
(82)  route: 81, 52               -                   [512, 20, 20]       -
(83)  conv_silu                   [512, 20, 20]       [256, 20, 20]       5971456
(84)  route: 82                   -                   [512, 20, 20]       -
(85)  conv_silu                   [512, 20, 20]       [256, 20, 20]       6103552
(86)  conv_silu                   [256, 20, 20]       [256, 20, 20]       6170112
(87)  conv_silu                   [256, 20, 20]       [256, 20, 20]       6760960
(88)  route: 87, 83               -                   [512, 20, 20]       -
(89)  conv_silu                   [512, 20, 20]       [512, 20, 20]       7025152
(90)  route: 71                   -                   [128, 80, 80]       -
(91)  conv_logistic               [128, 80, 80]       [255, 80, 80]       7058047
(92)  yolo                        [255, 80, 80]       -                   -
(93)  route: 80                   -                   [256, 40, 40]       -
(94)  conv_logistic               [256, 40, 40]       [255, 40, 40]       7123582
(95)  yolo                        [255, 40, 40]       -                   -
(96)  route: 89                   -                   [512, 20, 20]       -
(97)  conv_logistic               [512, 20, 20]       [255, 20, 20]       7254397
(98)  yolo                        [255, 20, 20]       -                   -

Output YOLO blob names: yolo_93 yolo_96 yolo_99

Total number of YOLO layers: 260

Building YOLO network complete
Building the TensorRT Engine

NOTE: letter_box is set in cfg file, make sure to set maintain-aspect-ratio=1 in config_infer file to get better accuracy

File does not exist: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calib.table
Load image: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calibration/000000000724.jpg
Progress: 0.1%
CUDA failure: no error in file yoloPlugins.cpp at line 233
Aborted (core dumped)

I did some updates in the repo. Can you test with the new files?

jessie-zhao avatar Aug 16 '22 03:08 jessie-zhao


Any suggestion?

jessie-zhao avatar Aug 18 '22 05:08 jessie-zhao

I don't have this GPU available on AWS to test, so it's a bit hard for me to investigate this issue. Can you test with another GPU?

marcoslucianops avatar Aug 19 '22 12:08 marcoslucianops

Yes, I used another env and have the same issue

jessie-zhao avatar Aug 19 '22 22:08 jessie-zhao

Yes, I used another env and have the same issue.

Another GPU?

marcoslucianops avatar Aug 22 '22 21:08 marcoslucianops

Yes, another env with new GPU

jessie-zhao avatar Aug 24 '22 06:08 jessie-zhao

Which GPU?

marcoslucianops avatar Aug 25 '22 22:08 marcoslucianops

Which GPU?

A10

jessie-zhao avatar Aug 29 '22 01:08 jessie-zhao

cluster-mode=2 not 4

Doben2001 avatar Sep 18 '22 10:09 Doben2001
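For context on the suggestion above: in the Gst-nvinfer documentation, cluster-mode selects how detections are clustered (0 = GroupRectangles, 1 = DBSCAN, 2 = NMS, 3 = DBSCAN + NMS hybrid, 4 = no clustering). Whether this setting is related to the crash is not established in this thread. Below is a minimal Python sketch that switches the key in the text of a config_infer file; `set_cluster_mode` is a hypothetical helper, not part of DeepStream.

```python
import re

# cluster-mode values as listed in the Gst-nvinfer documentation.
CLUSTER_MODES = {
    0: "GroupRectangles",
    1: "DBSCAN",
    2: "NMS",
    3: "DBSCAN + NMS hybrid",
    4: "none",
}


def set_cluster_mode(config_text, mode):
    """Return config_text with every cluster-mode line set to `mode`.

    Hypothetical helper for illustration only.
    """
    if mode not in CLUSTER_MODES:
        raise ValueError(f"unknown cluster-mode: {mode}")
    new_text, count = re.subn(
        r"(?m)^cluster-mode\s*=.*$", f"cluster-mode={mode}", config_text
    )
    if count == 0:
        raise KeyError("cluster-mode key not found in config text")
    return new_text


if __name__ == "__main__":
    original = "[property]\nbatch-size=1\ncluster-mode=4\n"
    print(set_cluster_mode(original, 2))
```

Working on the text rather than through a config parser keeps comments and key order in the file unchanged.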

As I said, I don't have A10 GPU to test. In my tests, using other GPUs, it's working.

marcoslucianops avatar Sep 23 '22 13:09 marcoslucianops

It seems calib.table was not created. Could you please share your calib.table with me, so I can check whether it gets past this step?

jessie-zhao avatar Sep 29 '22 01:09 jessie-zhao

The calib.table file is created from the images you put in the calibration folder and the inference model. It is written only after the calibration process finishes.

marcoslucianops avatar Oct 17 '22 12:10 marcoslucianops
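Since calib.table depends on the calibration images, it can help to confirm that the image list is valid before starting the build. Below is a minimal Python sketch that collects readable .jpg files from a folder and writes their absolute paths to a list file. The function name `write_calibration_list`, the folder name and the list file name are assumptions for illustration. If I recall the repository's INT8 guide correctly, the list file is passed to the engine builder through an environment variable (INT8_CALIB_IMG_PATH); please check the guide for the exact names.

```python
import os
from pathlib import Path


def write_calibration_list(image_dir, list_path, limit=1000):
    """Write absolute paths of up to `limit` readable .jpg files.

    Hypothetical helper: one path per line in `list_path`.
    Returns the number of paths written.
    """
    candidates = sorted(Path(image_dir).glob("*.jpg"))[:limit]
    readable = [
        p.resolve()
        for p in candidates
        if p.is_file() and os.access(p, os.R_OK)
    ]
    Path(list_path).write_text("".join(f"{p}\n" for p in readable))
    return len(readable)


if __name__ == "__main__":
    # "calibration" and "calibration.txt" are assumed names.
    if Path("calibration").is_dir():
        count = write_calibration_list("calibration", "calibration.txt")
        print(f"listed {count} calibration images")
```

A count of zero would mean no calibration image can be read. In the logs in this thread at least one image is loaded before the abort, so an empty list is not the cause there.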

It seems calib.table was not created. Could you please share your calib.table with me, so I can check whether it gets past this step?

Hello, did you solve the problem? File does not exist: /opt/nvidia/deepstream/YOLO_V5_INT8/DeepStream-Yolo/calib.table

I also have the same problem.

stepstep123 avatar Nov 16 '22 09:11 stepstep123

@stepstep123

https://github.com/marcoslucianops/DeepStream-Yolo/issues/163#issuecomment-1317375535

marcoslucianops avatar Nov 16 '22 17:11 marcoslucianops

Hi everyone, I'm facing the same error but with YoloV7. Has anyone found how to fix this problem?

mvidela31 avatar Feb 23 '23 00:02 mvidela31