pose detection works, but not with hands and/or face - Ubuntu 20.04 GPU

Open benrubin opened this issue 2 years ago • 3 comments

Issue Summary

I've successfully installed and re-installed OpenPose on two brand-new Ubuntu 20.04 systems, but I can't get hand or face detection to work.

If I only detect the pose (default), then the demo runs perfectly and produces keypoint json file output.
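To confirm what a pose-only run actually wrote, the JSON files can be inspected directly. The following is a minimal sketch, assuming the standard OpenPose JSON layout (a top-level "people" list whose entries hold flat x, y, confidence arrays under keys such as pose_keypoints_2d). The function names are hypothetical helpers, not part of OpenPose.

```python
import json
from pathlib import Path

# Keypoint groups as named in OpenPose's JSON output. The face and hand
# groups are only expected to be non-empty when --face / --hand are enabled.
KEYPOINT_KEYS = (
    "pose_keypoints_2d",
    "face_keypoints_2d",
    "hand_left_keypoints_2d",
    "hand_right_keypoints_2d",
)


def count_keypoints(data):
    """Return one dict per person with the number of (x, y, confidence) triplets per group."""
    summary = []
    for person in data.get("people", []):
        summary.append({key: len(person.get(key, [])) // 3 for key in KEYPOINT_KEYS})
    return summary


def summarize_file(json_path):
    """Load one OpenPose keypoint file (hypothetical helper) and count its keypoints."""
    return count_keypoints(json.loads(Path(json_path).read_text()))


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(path, summarize_file(path))
```

With a pose-only run, the hand and face counts should stay at zero while the pose group is populated.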

But when I run with -hand or -face, I get this error: Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED

Type of Issue

  • Compilation/installation error
  • Execution error

Errors

When compiling, I get about a dozen of these warnings: CUDNN_STATUS_VERSION_MISMATCH (see below for more details on these warnings)

I'm working on a remote EC2 instance (no display), so I'm using these options when running: -display 0 --render_pose 0 --write_json

System Configuration

  • AWS g4dn.4xlarge EC2 instance with NVIDIA Tesla T4 8GB GPU (no display)
  • Ubuntu 20.04
  • CUDA: 11.7 (11.7.99)
  • cuDNN: 8.5.0
  • gcc: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
  • cmake: 3.18.0

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   20C    P8     9W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
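The nvcc output reports toolkit release 11.7, while nvidia-smi reports CUDA Version 12.0, which is the highest CUDA version the installed driver supports rather than the installed toolkit. A minimal sketch of a sanity check that the toolkit is not newer than what the driver supports; the function names are hypothetical, and the comparison is a conservative rule of thumb, not an official compatibility test.

```python
import re


def toolkit_version(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_cuda_version(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of the nvidia-smi header."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_within_driver(nvcc_output, smi_output):
    """Return True if the toolkit is not newer than the driver's CUDA version, None if unparsable."""
    toolkit = toolkit_version(nvcc_output)
    driver = driver_cuda_version(smi_output)
    if toolkit is None or driver is None:
        return None
    return toolkit <= driver


if __name__ == "__main__":
    nvcc_text = "Cuda compilation tools, release 11.7, V11.7.99"
    smi_text = "| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |"
    print(toolkit_version(nvcc_text), driver_cuda_version(smi_text))
    print(toolkit_within_driver(nvcc_text, smi_text))
```

For the versions listed in this report the check passes, which suggests the toolkit/driver pairing alone is unlikely to explain the failure.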

I had to download the pose_25, hand, and face models from alternative sources (Dropbox or Google Drive, I can't recall).

Compile Issues

When I compiled, I had 11 or 12 of these CUDNN_STATUS_VERSION_MISMATCH warnings, all from various components of caffe:

   21 |   switch (status) {
      |          ^
In file included from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/device_alternate.hpp:40,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/common.hpp:19,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/blob.hpp:8,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/caffe.hpp:7,
                 from /home/ubuntu/ben-dev/openpose/3rdparty/caffe/examples/cpp_classification/classification.cpp:1:
/home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp: In function ‘const char* cudnnGetErrorString(cudnnStatus_t)’:
/home/ubuntu/ben-dev/openpose/3rdparty/caffe/include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_VERSION_MISMATCH’ not handled in switch [-Wswitch]

Executed Command

build/examples/openpose/openpose.bin -image_dir examples/media/ -display 0 --render_pose 0 --write_json ../docker_work/data/ -hand -hand_render 0
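To compare the working and failing configurations side by side, the same command can be run once per variant. This is a minimal sketch, assuming it is launched from the OpenPose root directory and that the binary path and flags match the command above (written here with double dashes). The per-variant output directories and the helper names are hypothetical.

```python
import os
import subprocess

OPENPOSE_BIN = "build/examples/openpose/openpose.bin"
BASE_ARGS = ["--image_dir", "examples/media/", "--display", "0", "--render_pose", "0"]

# Extra flags per variant; "pose_only" is the configuration reported as working.
VARIANTS = {
    "pose_only": [],
    "hand": ["--hand", "--hand_render", "0"],
    "face": ["--face", "--face_render", "0"],
}


def build_command(variant, json_dir):
    """Assemble the argument list for one variant, writing JSON to json_dir."""
    return [OPENPOSE_BIN, *BASE_ARGS, "--write_json", json_dir, *VARIANTS[variant]]


def run_all(json_root):
    """Run every variant and return its exit code (non-zero indicates a failure)."""
    results = {}
    for name in VARIANTS:
        out_dir = os.path.join(json_root, name)
        os.makedirs(out_dir, exist_ok=True)
        proc = subprocess.run(build_command(name, out_dir), capture_output=True, text=True)
        results[name] = proc.returncode
    return results


if __name__ == "__main__":
    print(run_all("../docker_work/data"))
```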

OpenPose Output

Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.
F1206 18:25:25.215843  9825 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0)  CUDNN_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
    @     0x7f86961821c3  google::LogMessage::Fail()
    @     0x7f869618725b  google::LogMessage::SendToLog()
    @     0x7f8696181ebf  google::LogMessage::Flush()
    @     0x7f86961826ef  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f8695db9120  caffe::CuDNNReLULayer<>::LayerSetUp()
    @     0x7f8695e8de8d  caffe::Net<>::Init()
    @     0x7f8695e903a5  caffe::Net<>::Net()
    @     0x7f86968145fc  op::NetCaffe::initializationOnThread()
    @     0x7f86967f9488  op::HandExtractorCaffe::netInitializationOnThread()
    @     0x7f86967faae5  op::HandExtractorNet::initializationOnThread()
    @     0x7f869685a4f7  op::Worker<>::initializationOnThreadNoException()
    @     0x7f869685a648  op::SubThread<>::initializationOnThread()
    @     0x7f869685be58  op::Thread<>::initializationOnThread()
    @     0x7f869685f42c  op::Thread<>::threadFunction()
    @     0x7f869649ddf4  (unknown)
    @     0x7f8695bdb609  start_thread
    @     0x7f86962d9133  clone
Aborted (core dumped)
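The trace shows the abort happening while Caffe sets up a cuDNN layer for the hand network, with status 1 (CUDNN_STATUS_NOT_INITIALIZED). One way to narrow this down might be to check whether a cuDNN handle can be created at all outside OpenPose. A minimal sketch using ctypes against the cuDNN C API (cudnnGetVersion, cudnnCreate, cudnnDestroy, cudnnGetErrorString); the library name libcudnn.so.8 is an assumption based on the cuDNN 8.5.0 install listed above, and probe_cudnn is a hypothetical helper.

```python
import ctypes


def probe_cudnn(libname="libcudnn.so.8"):
    """Try to load cuDNN and create a handle; report the outcome instead of raising."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError as exc:
        # The shared library could not be found or loaded by the dynamic linker.
        return {"loaded": False, "error": str(exc)}

    # Declare the C signatures so ctypes converts arguments and results correctly.
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    lib.cudnnGetErrorString.argtypes = [ctypes.c_int]
    lib.cudnnGetErrorString.restype = ctypes.c_char_p
    lib.cudnnCreate.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
    lib.cudnnCreate.restype = ctypes.c_int
    lib.cudnnDestroy.argtypes = [ctypes.c_void_p]
    lib.cudnnDestroy.restype = ctypes.c_int

    handle = ctypes.c_void_p()
    status = lib.cudnnCreate(ctypes.byref(handle))  # 0 means CUDNN_STATUS_SUCCESS
    status_name = lib.cudnnGetErrorString(status).decode()
    if status == 0:
        lib.cudnnDestroy(handle)

    return {
        "loaded": True,
        "version": lib.cudnnGetVersion(),
        "status": status,
        "status_name": status_name,
    }


if __name__ == "__main__":
    print(probe_cudnn())
```

If the handle is created successfully here but OpenPose still aborts, the problem is more likely specific to the OpenPose/Caffe build or to conditions at run time (for example, available GPU memory) than to the cuDNN installation itself.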

OpenPose version: Latest GitHub code
Caffe version: Default from OpenPose
OpenCV version: pre-compiled apt-get install libopencv-dev

I have reviewed a number of issues that seem related (no solution found so far):

  • Google colab helper script #949
  • CUDNN_STATUS_VERSION_MISMATCH #2225

benrubin • Dec 06 '23 22:12

Could you please share how you installed OpenPose? I tried following the install-from-source instructions. When I compile with make -j`nproc`, it requires Boost 1.54, but I can't install Boost 1.54 successfully. Thanks!

wenxie18 • Feb 20 '24 02:02

Hello, judging by the name of your directory, you may be using Docker. You can find Dockerfiles with cuDNN properly initialized in Issue #2290.

hiibolt • Apr 04 '24 04:04