py-faster-rcnn

Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

Open nyanmn opened this issue 7 years ago • 30 comments

Installation went fine. I hit one issue with cuDNN 5.1 during installation, but after following the suggestion here it completed without problems.

Now I run the demo code with ./tools/demo.py

and I get the following error:


I1117 09:48:41.011925 12503 net.cpp:51] Initializing net from parameters: 
name: "VGG_ILSVRC_16_layers"
state {
  phase: TEST
  level: 0
}
.
.
.
layer {
  name: "cls_prob"
  type: "Softmax"
  bottom: "cls_score"
  top: "cls_prob"
}
I1117 09:48:41.012234 12503 layer_factory.hpp:77] Creating layer input
I1117 09:48:41.012251 12503 net.cpp:84] Creating Layer input
I1117 09:48:41.012259 12503 net.cpp:380] input -> data
I1117 09:48:41.012271 12503 net.cpp:380] input -> im_info
I1117 09:48:41.328574 12503 net.cpp:122] Setting up input
I1117 09:48:41.328608 12503 net.cpp:129] Top shape: 1 3 224 224 (150528)
I1117 09:48:41.328614 12503 net.cpp:129] Top shape: 1 3 (3)
I1117 09:48:41.328618 12503 net.cpp:137] Memory required for data: 602124
I1117 09:48:41.328624 12503 layer_factory.hpp:77] Creating layer conv1_1
I1117 09:48:41.328655 12503 net.cpp:84] Creating Layer conv1_1
I1117 09:48:41.328660 12503 net.cpp:406] conv1_1 <- data
I1117 09:48:41.328670 12503 net.cpp:380] conv1_1 -> conv1_1
F1117 09:48:41.676553 12503 cudnn.hpp:128] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0)  CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
Aborted (core dumped)

What is wrong with my Faster R-CNN installation?

I have CUDA 8.0 and libcudnn5_5.1.10-1+cuda8.0 installed on Ubuntu 16.04. My graphics card is a Quadro K4200.

nyanmn avatar Nov 17 '17 02:11 nyanmn

I encountered the same problem. I use CUDA 8.0, cuDNN 5.1.10, Ubuntu 14.04 and a 1080Ti. Can anybody give us a solution?

I have tried this and it doesn't work: https://github.com/rbgirshick/py-faster-rcnn/issues/237

cechung avatar Nov 20 '17 05:11 cechung

@EricccChung I have the same problem. If you've solved it, can you tell me how? Thank you very much.

Huangswust182 avatar Nov 20 '17 11:11 Huangswust182

I have this solution, please have a look.

https://stackoverflow.com/questions/47342267/check-failed-status-cudnn-status-success-3-vs-0-cudnn-status-bad-param-fo

nyanmn avatar Nov 20 '17 12:11 nyanmn

I have this solution, please have a look.

https://stackoverflow.com/questions/47342267/check-failed-status-cudnn-status-success-3-vs-0-cudnn-status-bad-param-fo

nyanmn avatar Nov 20 '17 12:11 nyanmn

@nyanmn Thank you for your answer, but doesn't the caffe-fast-rcnn branch lack support for cuDNN v6?

Huangswust182 avatar Nov 20 '17 12:11 Huangswust182

Not sure, but I can use it with v6.

nyanmn avatar Nov 20 '17 23:11 nyanmn

You will need to replace it with the main trunk of Caffe. That is what I did.

nyanmn avatar Nov 21 '17 00:11 nyanmn

@nyanmn Do you mean replacing the whole caffe-fast-rcnn folder at $FRCN_ROOT with the folder downloaded from the Caffe GitHub repository?

cechung avatar Nov 21 '17 04:11 cechung

I did this first:

cd caffe-fast-rcnn
git remote add caffe https://github.com/BVLC/caffe.git
git fetch caffe
git merge -X theirs caffe/master

After merging, remove the line self_.attr("phase") = static_cast<int>(this->phase_); from include/caffe/layers/python_layer.hpp.

Then upgrade to cuDNN v6.0 for CUDA 8.0. It worked for me.
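After an upgrade like this it can help to confirm which cuDNN version the build will actually pick up. The sketch below is only an illustration: it assumes the installed cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (as the cuDNN 5 and 6 headers do), and the helper name cudnn_version_from_header is hypothetical, not part of Caffe or py-faster-rcnn.

```python
import re


def cudnn_version_from_header(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None.

    Hypothetical helper: it only reads the three version macros and does
    not load the cuDNN library itself.
    """
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR      5"
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, re.MULTILINE)
        if match is None:
            return None  # macro missing: not a header this sketch understands
        parts.append(int(match.group(1)))
    return tuple(parts)
```

To use it, read the header that your Caffe build includes and pass its contents to the function. The header location depends on how cuDNN was installed (for example somewhere under the CUDA include directory), so the path is left to the reader.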

thanks

nyanmn avatar Nov 21 '17 05:11 nyanmn

OK, I will try it and reply if it works. Thanks a lot!

cechung avatar Nov 21 '17 05:11 cechung

Please refer to my link for details: https://stackoverflow.com/questions/47342267/check-failed-status-cudnn-status-success-3-vs-0-cudnn-status-bad-param-fo. Please vote if it works.

nyanmn avatar Nov 21 '17 05:11 nyanmn

@nyanmn Hello, I can't compile caffe-fast-rcnn with cuDNN v6 + CUDA 8.0. I get a lot of errors, for example make: *** [.build_release/src/caffe/syncedmem.o] Error 1. Is there anything else that needs to be changed?

Huangswust182 avatar Nov 23 '17 07:11 Huangswust182

I have just solved my problem by adding engine: CAFFE to the convolution_param. You can try this.

Jingwei-Liao avatar Dec 06 '17 14:12 Jingwei-Liao

@woshiljw Hi, which file should I put "engine: CAFFE" in? Thanks!

napolun279 avatar Dec 07 '17 02:12 napolun279

@napolun279 In the net description file, add "engine: CAFFE" to the "convolution_param" block that belongs to each Convolution layer. (My English is not so good; I hope you can understand.)

Jingwei-Liao avatar Dec 07 '17 04:12 Jingwei-Liao

@woshiljw
Thank you, I will give it a try and then reply!

napolun279 avatar Dec 07 '17 04:12 napolun279

@napolun279 You should add this to every Convolution layer.
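A minimal sketch of applying this suggestion to a whole net description file at once, rather than editing each layer by hand. It treats the prototxt as plain text and does not import Caffe. The function name force_caffe_engine is hypothetical, and the sketch assumes there are no braces inside comments or quoted strings within the convolution_param blocks.

```python
import re

# Opening of a convolution_param block, e.g. "convolution_param {"
_CONV_PARAM_OPEN = re.compile(r"convolution_param\s*\{")


def force_caffe_engine(prototxt_text):
    """Add `engine: CAFFE` to every convolution_param block lacking an engine.

    Hypothetical helper working on prototxt text only.
    """
    pieces = []
    pos = 0
    for match in _CONV_PARAM_OPEN.finditer(prototxt_text):
        # Walk forward to the matching closing brace, so that nested
        # blocks such as weight_filler { ... } are handled.
        depth = 1
        i = match.end()
        while i < len(prototxt_text) and depth:
            if prototxt_text[i] == "{":
                depth += 1
            elif prototxt_text[i] == "}":
                depth -= 1
            i += 1
        block_body = prototxt_text[match.end():i]

        pieces.append(prototxt_text[pos:match.end()])
        # Leave blocks alone if they already choose an engine.
        if not re.search(r"\bengine\s*:", block_body):
            # Protobuf text format is whitespace-separated, so the new
            # field can sit on the same line as the opening brace.
            pieces.append(" engine: CAFFE")
        pos = match.end()
    pieces.append(prototxt_text[pos:])
    return "".join(pieces)
```

The idea would be to read the test prototxt that demo.py loads, pass its text through the function, and write the result back after keeping a copy of the original. Note that this only touches Convolution layers; other layer types that can use cuDNN are left unchanged.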

Jingwei-Liao avatar Dec 07 '17 04:12 Jingwei-Liao

@woshiljw
OK, thanks very much for your enthusiastic answer!

napolun279 avatar Dec 07 '17 04:12 napolun279

@napolun279 You are welcome. I hope you can help more people too.

Jingwei-Liao avatar Dec 07 '17 04:12 Jingwei-Liao

@nyanmn Hi, I have tried your solution, but it doesn't work. The error changed from F1212 22:23:30.527931 21847 cudnn_relu_layer.cu:24] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM to Segmentation fault (core dumped).

trikim avatar Dec 12 '17 14:12 trikim

That could be some other issue. That one is a ReLU layer issue.

nyanmn avatar Dec 12 '17 15:12 nyanmn

Following http://blog.csdn.net/u012841667/article/details/53436615, I replaced all the cudnn* files with the ones from the newest Caffe project at https://github.com/BVLC/caffe. The cudnn* files are in:

/py-faster-rcnn/caffe-fast-rcnn/include/caffe/util/
/py-faster-rcnn/caffe-fast-rcnn/include/caffe/layers/
/py-faster-rcnn/caffe-fast-rcnn/src/caffe/util/
/py-faster-rcnn/caffe-fast-rcnn/src/caffe/layers/

My GPU is a 940MX with 2G of memory, so the problem changed to:

Check failed: error == cudaSuccess (2 vs. 0) out of memory

In the end I switched to CPU mode, and it worked well with the command:

./tools/demo.py --cpu
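The out-of-memory case above can be anticipated by checking free GPU memory before choosing between GPU and CPU mode. The following is a rough sketch, not part of py-faster-rcnn: it parses the output of nvidia-smi --query-gpu=memory.total,memory.used --format=csv,noheader,nounits, the function names are hypothetical, and needed_mib is a threshold the caller has to choose (the thread does not state how much memory the demo needs).

```python
def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV output into a list of (total_mib, used_mib).

    Expects one line per GPU in the form "6071, 264", as produced with
    --format=csv,noheader,nounits. Hypothetical helper.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        total, used = (int(field) for field in line.split(","))
        rows.append((total, used))
    return rows


def should_use_cpu(csv_text, needed_mib):
    """Return True if no listed GPU has at least needed_mib MiB free.

    needed_mib is an assumption supplied by the caller. An empty listing
    (no GPU reported) also returns True.
    """
    return all(total - used < needed_mib
               for total, used in parse_gpu_memory(csv_text))
```

The text could be obtained with subprocess.check_output and decoded before parsing. When should_use_cpu returns True, running ./tools/demo.py --cpu as described above is the fallback.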

trikim avatar Dec 13 '17 03:12 trikim

I think you need a different GPU to test Faster R-CNN. You can check the GPU requirements on the website. I think it needs more than 2G, but I don't remember exactly.

nyanmn avatar Dec 13 '17 06:12 nyanmn

@napolun279 @EricccChung @Huangswust182 @woshiljw @nyanmn My problem is that when I run demo.py I get:

Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR
*** Check failure stack trace: ***
Aborted

I have not been able to solve it for several days! I have replaced all the cudnn* files, and the nvidia-smi output is:

Sat Dec 23 15:54:13 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
| 49%   31C    P5     8W / 120W |    264MiB /  6071MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1149      G   /usr/lib/xorg/Xorg                           199MiB |
|    0      1884      G   cinnamon                                      62MiB |
+-----------------------------------------------------------------------------+

tongpinmo avatar Dec 23 '17 08:12 tongpinmo

Was that initial attempt running as a test run? That sounds similar to other issues I've come across.

Brad-Robbins avatar Jan 03 '18 10:01 Brad-Robbins

To solve this problem you have to use cuDNN, but there is an issue related to cuDNN v5 and v5.1. Downgrade to v4 and rebuild Caffe. I hope that solves the issue.

Thank you.

amlandas78 avatar May 02 '18 10:05 amlandas78

@woshiljw I've solved it!!!! It finally works!!!! Thank you!!

yjyjkim avatar Jun 22 '18 07:06 yjyjkim

Jingwei-Liao wrote: "i have just solved my problem by adding engine: CAFFE in the convolution_param you can try this"

It worked for me. I am wondering what this parameter means?

shriyashchougule avatar Sep 21 '18 05:09 shriyashchougule

@woshiljw It also worked for me. Thanks a lot. Happy new year!

rongduo avatar Feb 03 '19 15:02 rongduo

Code written by others runs successfully on my computer, but my own code does not. This error appears when I run my own scripts, so I think something is wrong in my code rather than in cuDNN or CUDA, but I don't know where the problem is.

sljlp avatar Sep 12 '19 12:09 sljlp