
Release build of darknet crashes

Open ukoehler opened this issue 8 months ago • 0 comments

When an up-to-date version of darknet is built in release mode, execution crashes. It runs slowly but without crashing under valgrind, or when built as a debug version.

Command used:

./darknet detector test cfg/coco.data cfg/yolov4-csp-x-swish.cfg pretrained/yolov4-csp-x-swish.weights -thresh 0.25

Models were downloaded as per the instructions in the README.

Platform

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

The latest version of CUDA is installed.

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Building from the Makefile with the following settings

GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=1
OPENMP=1
LIBSO=0
ZED_CAMERA=0
ZED_CAMERA_v2_8=0

# set GPU=1 and CUDNN=1 to speedup on GPU
# set CUDNN_HALF=1 to further speedup 3 x times (Mixed-precision on Tensor Cores) GPU: Volta, Xavier, Turing, Ampere, Ada and higher
# set AVX=1 and OPENMP=1 to speedup on CPU (if error occurs then set AVX=0)
# set ZED_CAMERA=1 to enable ZED SDK 3.0 and above
# set ZED_CAMERA_v2_8=1 to enable ZED SDK 2.X

USE_CPP=0
DEBUG=0

leads to the following error

 CUDA-version: 12020 (12020), cuDNN: 8.9.5, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 4.5.4
Illegal instruction (core dumped)
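
One thing that may be worth ruling out, since the Makefile comment itself says to set AVX=0 if an error occurs: whether the CPU reports the instruction-set extensions that an AVX=1 build might assume. This is a minimal sketch, assuming Linux with /proc/cpuinfo. The helper name has_cpu_flag and the list of flags checked are illustrative choices, not anything taken from darknet.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds if FLAG appears as a whole word on the first
# "flags" line of the cpuinfo file (defaults to /proc/cpuinfo).
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

# The flags below are an assumed selection; adjust to whatever the build enables.
for flag in avx avx2 fma; do
    if has_cpu_flag "$flag"; then
        echo "$flag: reported by the CPU"
    else
        echo "$flag: NOT reported by the CPU"
    fi
done
```

If a flag turns out to be missing, rebuilding with AVX=0 (for example `make clean` followed by `make AVX=0`, since a variable given on the make command line overrides the assignment in the Makefile) would show whether the crash goes away. A flag being present does not prove the build is fine, it only removes one candidate cause.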

Without CUDNN:

GPU=1
CUDNN=0
CUDNN_HALF=0
OPENCV=1
AVX=1
OPENMP=1
LIBSO=0
ZED_CAMERA=0
ZED_CAMERA_v2_8=0

# set GPU=1 and CUDNN=1 to speedup on GPU
# set CUDNN_HALF=1 to further speedup 3 x times (Mixed-precision on Tensor Cores) GPU: Volta, Xavier, Turing, Ampere, Ada and higher
# set AVX=1 and OPENMP=1 to speedup on CPU (if error occurs then set AVX=0)
# set ZED_CAMERA=1 to enable ZED SDK 3.0 and above
# set ZED_CAMERA_v2_8=1 to enable ZED SDK 2.X

USE_CPP=0
DEBUG=0

leads to an earlier crash

 CUDA-version: 12020 (12020), GPU count: 1
 OpenCV version: 4.5.4
Illegal instruction (core dumped)
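
To narrow down which instruction triggers the "Illegal instruction" signal, one option is to run the release binary under gdb in batch mode and print the backtrace together with the instruction at the program counter. This is a sketch, assuming gdb is installed. The wrapper name show_fault is hypothetical.

```shell
#!/bin/sh
# show_fault CMD [ARGS...]
# Hypothetical wrapper: runs CMD under gdb non-interactively. When the program
# stops on a signal, gdb prints the backtrace and disassembles the instruction
# at the current program counter.
show_fault() {
    gdb -batch -ex run -ex bt -ex 'x/i $pc' --args "$@"
}

# Usage with the command from this report:
# show_fault ./darknet detector test cfg/coco.data cfg/yolov4-csp-x-swish.cfg pretrained/yolov4-csp-x-swish.weights -thresh 0.25
```

A release build has no debug symbols by default, so the backtrace may be sparse, but the disassembled instruction alone can hint at which instruction-set extension the CPU rejects.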

A debug build works fine:

GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=1
OPENMP=1
LIBSO=0
ZED_CAMERA=0
ZED_CAMERA_v2_8=0

# set GPU=1 and CUDNN=1 to speedup on GPU
# set CUDNN_HALF=1 to further speedup 3 x times (Mixed-precision on Tensor Cores) GPU: Volta, Xavier, Turing, Ampere, Ada and higher
# set AVX=1 and OPENMP=1 to speedup on CPU (if error occurs then set AVX=0)
# set ZED_CAMERA=1 to enable ZED SDK 3.0 and above
# set ZED_CAMERA_v2_8=1 to enable ZED SDK 2.X

USE_CPP=0
DEBUG=1

leads to:

 DEBUG=1
 CUDA-version: 12020 (12020), cuDNN: 8.9.5, CUDNN_HALF=1, GPU count: 1
 CUDNN_HALF=1
 OpenCV version: 4.5.4d
 0 : compute_capability = 610, cudnn_half = 0, GPU: NVIDIA TITAN Xp
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
   layer   filters  size/strd(dil)      input                output
   0 Create CUDA-stream - 0
 Create cudnn-handle 0
conv     32       3 x 3/ 1    640 x 640 x   3 ->  640 x 640 x  32 0.708 BF
   1 conv     80       3 x 3/ 2    640 x 640 x  32 ->  320 x 320 x  80 4.719 BF
...
 203 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.05, obj_norm: 0.40, cls_norm: 0.50, delta_norm: 1.00, scale_x_y: 2.00
nms_kind: diounms (2), beta = 0.600000
Total BFLOPS 221.986
avg_outputs = 1015760
 Allocate additional workspace_size = 81.93 MB
Loading weights from pretrained/yolov4-csp-x-swish.weights...
 seen 64, trained: 0 K-images (0 Kilo-batches_64)
Done! Loaded 204 layers from weights-file
Enter Image Path: test/4.png
 Detection layer: 195 - type = 28
 Detection layer: 199 - type = 28
 Detection layer: 203 - type = 28
test/4.png: Predicted in 136.340000 milli-seconds.
person: 91%
chair: 69%
...
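
Since DEBUG is the only setting that differs between the crashing and the working configuration, comparing the compiler flags of the two builds may also help. The sketch below extracts the distinct machine options (those starting with -m) from a make dry run. The helper name list_machine_flags is hypothetical, and which flags actually show up depends on the Makefile in use.

```shell
#!/bin/sh
# list_machine_flags
# Hypothetical helper: reads build commands on stdin, splits them into
# whitespace-separated tokens, and prints the distinct tokens starting with -m.
list_machine_flags() {
    tr -s '[:space:]' '\n' | grep -E -- '^-m' | sort -u
}

# Usage: -n prints the commands without running them, -B treats every target
# as out of date so that all compile commands are shown.
# make -n -B DEBUG=0 | list_machine_flags
# make -n -B DEBUG=1 | list_machine_flags
```

Any -m option that the CPU does not support would be a candidate for the crash in an optimized build, where the compiler is free to actually emit such instructions.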

ukoehler, Oct 09 '23 11:10