pose detection works, but not with hands and/or face - Ubuntu 20.04 GPU
Issue Summary
I've successfully installed and re-installed OpenPose on two brand-new Ubuntu 20.04 systems, but I can't get hand or face detection to work.
If I only detect the pose (the default), the demo runs perfectly and writes the keypoint JSON files.
But when I run with -hand or -face, I get this error: Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED
Type of Issue
- Compilation/installation error
- Execution error
Errors
When compiling, I get about a dozen of these warnings:
CUDNN_STATUS_VERSION_MISMATCH
(see below for more details on these warnings)
I'm working on a remote EC2 instance (no display), so I'm using these options when running:
-display 0 --render_pose 0 --write_json
System Configuration
AWS g4dn.4xlarge EC2 instance with NVIDIA Tesla T4 8GB GPU (no display)
- OS: Ubuntu 20.04
- CUDA: 11.7 (11.7.99)
- cuDNN: 8.5.0
- gcc: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
- cmake: 3.18.0
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   20C    P8     9W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
I had to download the pose_25, hand, and face models from alternative sources (DropBox or Google Drive, I can't recall).
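One point that often causes confusion when reading the two outputs above: the "CUDA Version" field printed by nvidia-smi is the newest CUDA runtime the installed driver supports, while nvcc reports the toolkit that is actually installed. A toolkit release at or below the driver's value is expected to work. The following is a minimal sketch of that comparison; the function names are hypothetical helpers, and the sample strings are copied from the outputs in this report.

```python
import re
from typing import Tuple


def nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Return (major, minor) from the 'release X.Y' field of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def driver_cuda_version(smi_output: str) -> Tuple[int, int]:
    """Return (major, minor) from the 'CUDA Version: X.Y' field of `nvidia-smi`."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    return int(match.group(1)), int(match.group(2))


def toolkit_supported(nvcc_output: str, smi_output: str) -> bool:
    """True when the installed toolkit is not newer than what the driver supports."""
    return nvcc_release(nvcc_output) <= driver_cuda_version(smi_output)


# Sample lines taken from the outputs pasted in this report.
NVCC_SAMPLE = "Cuda compilation tools, release 11.7, V11.7.99"
SMI_SAMPLE = "| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |"

if __name__ == "__main__":
    print(nvcc_release(NVCC_SAMPLE), driver_cuda_version(SMI_SAMPLE))
    print("toolkit supported by driver:", toolkit_supported(NVCC_SAMPLE, SMI_SAMPLE))
```

By this check, the 11.7 toolkit and the 12.0 driver value reported here are not in conflict with each other, so the 11.7 vs. 12.0 difference alone does not explain the failure.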
Compile Issues
When I compiled, I got 11 or 12 of these CUDNN_STATUS_VERSION_MISMATCH warnings, all from various components of Caffe:
In file included from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/device_alternate.hpp:40,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/common.hpp:19,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/blob.hpp:8,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/caffe.hpp:7,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/examples/cpp_classification/classification.cpp:1:
/home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp: In function ‘const char* cudnnGetErrorString(cudnnStatus_t)’:
/home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_VERSION_MISMATCH’ not handled in switch [-Wswitch]
   21 |   switch (status) {
      |          ^
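Note that, as the message itself says, this is a -Wswitch diagnostic: the switch in Caffe's cudnnGetErrorString has no case for the CUDNN_STATUS_VERSION_MISMATCH enumerator that the installed cuDNN header declares. It is a compile-time warning about the enum, not a report that a version mismatch occurred at run time. To see which cuDNN version the build is compiled against, the version macros in the header can be read directly. Below is a minimal sketch, assuming cuDNN 8's layout where CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL are defined in cudnn_version.h; the function name and the header path in the comment are assumptions, and the sample text is illustrative rather than copied from this machine.

```python
import re
from typing import Tuple


def cudnn_header_version(header_text: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as '#define CUDNN_MAJOR 8'.
        pattern = r"^\s*#\s*define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError(name + " not found in header text")
        values.append(int(match.group(1)))
    return values[0], values[1], values[2]


# Illustrative header fragment matching the cuDNN 8.5.0 version stated above.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

if __name__ == "__main__":
    # On a real system, read the header instead. The path is an assumption and
    # varies by install method, e.g. /usr/include/cudnn_version.h.
    print(cudnn_header_version(SAMPLE_HEADER))
```

Comparing this tuple with the version of the libcudnn shared library that is loaded at run time is one way to rule a header/library mismatch in or out.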
Executed Command
build/examples/openpose/openpose.bin -image_dir examples/media/ -display 0 --render_pose 0 --write_json ../docker_work/data/ -hand -hand_render 0
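To narrow down which network triggers the failure, it can help to run the same command in three variants: body only, with hand, and with face. The sketch below only builds the argument lists, using the flags that appear in this report; the helper name openpose_command is hypothetical, and the binary path is the one from the command above.

```python
from typing import List


def openpose_command(
    image_dir: str,
    json_dir: str,
    hand: bool = False,
    face: bool = False,
    binary: str = "build/examples/openpose/openpose.bin",
) -> List[str]:
    """Build a headless OpenPose demo command line as an argv list."""
    cmd = [
        binary,
        "--image_dir", image_dir,
        "--display", "0",       # no display on the remote instance
        "--render_pose", "0",   # skip rendering, only write keypoints
        "--write_json", json_dir,
    ]
    if hand:
        cmd += ["--hand", "--hand_render", "0"]
    if face:
        cmd += ["--face"]
    return cmd


if __name__ == "__main__":
    for options in ({}, {"hand": True}, {"face": True}):
        print(" ".join(openpose_command("examples/media/", "../docker_work/data/", **options)))
```

Each list can be passed to subprocess.run from the OpenPose root directory, so the exit status of the three variants can be compared side by side.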
OpenPose Output
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.
F1206 18:25:25.215843 9825 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
@ 0x7f86961821c3 google::LogMessage::Fail()
@ 0x7f869618725b google::LogMessage::SendToLog()
@ 0x7f8696181ebf google::LogMessage::Flush()
@ 0x7f86961826ef google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8695db9120 caffe::CuDNNReLULayer<>::LayerSetUp()
@ 0x7f8695e8de8d caffe::Net<>::Init()
@ 0x7f8695e903a5 caffe::Net<>::Net()
@ 0x7f86968145fc op::NetCaffe::initializationOnThread()
@ 0x7f86967f9488 op::HandExtractorCaffe::netInitializationOnThread()
@ 0x7f86967faae5 op::HandExtractorNet::initializationOnThread()
@ 0x7f869685a4f7 op::Worker<>::initializationOnThreadNoException()
@ 0x7f869685a648 op::SubThread<>::initializationOnThread()
@ 0x7f869685be58 op::Thread<>::initializationOnThread()
@ 0x7f869685f42c op::Thread<>::threadFunction()
@ 0x7f869649ddf4 (unknown)
@ 0x7f8695bdb609 start_thread
@ 0x7f86962d9133 clone
Aborted (core dumped)
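Reading the trace: the fatal check fires in caffe::CuDNNReLULayer<>::LayerSetUp(), reached from op::HandExtractorCaffe::netInitializationOnThread(), so the abort happens while the hand network is being initialized, before any image is processed. For longer logs, the same facts can be pulled out automatically. This is a minimal sketch that assumes the glog fatal-line layout seen above; the function name summarize_failure is hypothetical and the sample is an excerpt of this output.

```python
import re
from typing import Dict, List, Union

FATAL_LINE = re.compile(r"^F\d{4} \S+\s+\d+ ([\w.]+):(\d+)\] (.*)$", re.MULTILINE)
FRAME_LINE = re.compile(r"^\s*@\s+0x[0-9a-fA-F]+\s+(.+)$", re.MULTILINE)
STATUS_AT_END = re.compile(r"(CUDNN_STATUS_\w+)\s*$")


def summarize_failure(log: str) -> Dict[str, Union[str, int, List[str]]]:
    """Extract the failing source location, cuDNN status, and OpenPose frames."""
    fatal = FATAL_LINE.search(log)
    if fatal is None:
        raise ValueError("no glog fatal line found")
    message = fatal.group(3)
    status = STATUS_AT_END.search(message)
    frames = [symbol.strip() for symbol in FRAME_LINE.findall(log)]
    return {
        "file": fatal.group(1),
        "line": int(fatal.group(2)),
        "status": status.group(1) if status else "",
        # Keep only frames inside OpenPose's own namespace.
        "openpose_frames": [f for f in frames if f.startswith("op::")],
    }


# Excerpt of the output above.
SAMPLE_LOG = """F1206 18:25:25.215843 9825 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
@ 0x7f86961821c3 google::LogMessage::Fail()
@ 0x7f8695db9120 caffe::CuDNNReLULayer<>::LayerSetUp()
@ 0x7f86968145fc op::NetCaffe::initializationOnThread()
@ 0x7f86967f9488 op::HandExtractorCaffe::netInitializationOnThread()
"""

if __name__ == "__main__":
    print(summarize_failure(SAMPLE_LOG))
```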
OpenPose version: Latest GitHub code
Caffe version: Default from OpenPose
OpenCV version: pre-compiled apt-get install libopencv-dev
I have reviewed a number of issues that seem related, with no solution found so far:
- Google colab helper script #949
- CUDNN_STATUS_VERSION_MISMATCH #2225
Could you please share how you installed OpenPose? I tried following the install-from-source instructions, but when I compile with make -j`nproc`, it requires Boost 1.54, and I can't install Boost 1.54 successfully. Thanks!
Hello, judging by the name of your directory, you may be using Docker. You can find Dockerfiles with cuDNN properly initialized in Issue #2290.