Check failed: status == CUDNN_STATUS_SUCCESS (9 vs. 0) CUDNN_STATUS_NOT_SUPPORTED

Open lhoangan opened this issue 8 years ago • 3 comments

Hi,

I got this error while running make runtest on caffe-segnet. I switched to cuDNN v2 after following this issue: https://github.com/alexgkendall/caffe-segnet/issues/12

Do you have any idea what could cause this problem?

[----------] 6 tests from CuDNNConvolutionLayerTest/0, where TypeParam = float
[ RUN ] CuDNNConvolutionLayerTest/0.TestSimpleConvolutionGroupCuDNN
F0802 16:20:11.039242 31344 cudnn_conv_layer.cu:85] Check failed: status == CUDNN_STATUS_SUCCESS (9 vs. 0) CUDNN_STATUS_NOT_SUPPORTED
*** Check failure stack trace: ***
    @ 0x2b2f9a134daa (unknown)
    @ 0x2b2f9a134ce4 (unknown)
    @ 0x2b2f9a1346e6 (unknown)
    @ 0x2b2f9a137687 (unknown)
    @ 0x2b2f9bb82e19 caffe::CuDNNConvolutionLayer<>::Forward_gpu()
    @ 0x45f6b7 caffe::Layer<>::Forward()
    @ 0x7715c9 caffe::CuDNNConvolutionLayerTest_TestSimpleConvolutionGroupCuDNN_Test<>::TestBody()
    @ 0x7c0bb3 testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x7b77f7 testing::Test::Run()
    @ 0x7b789e testing::TestInfo::Run()
    @ 0x7b79a5 testing::TestCase::Run()
    @ 0x7bace8 testing::internal::UnitTestImpl::RunAllTests()
    @ 0x7baf77 testing::UnitTest::Run()
    @ 0x451d8a main
    @ 0x2b2f9c7abf45 (unknown)
    @ 0x457ed9 (unknown)
    @ (nil) (unknown)
make: *** [runtest] Aborted

lhoangan avatar Aug 02 '16 15:08 lhoangan

@lhoangan Hi, I ran into the same error when using tf.keras.utils.multi_gpu_model (TensorFlow 1.13.1, Python 3.6, CUDA 10, cuDNN 7) to train my model on multiple GPUs, while training on a single GPU works fine. Did you find the cause and solve the problem? Thanks ~

tensorflow/stream_executor/cuda/cuda_dnn.cc:503] Check failed: cudnnSetTensorNdDescriptor(handle_.get(), elem_type, nd, dims.data(), strides.data()) == CUDNN_STATUS_SUCCESS (9 vs. 0)
batch_descriptor: {count: 1536 feature_map_count: 64 spatial: 208 208 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}

zheng-yuwei avatar Nov 08 '19 02:11 zheng-yuwei

Hi @zheng-yuwei, I'm sorry, but I no longer clearly remember this issue. I suspect it was caused by an unsupported cuDNN version, but I might be wrong. Sorry I can't be of more help.

lhoangan avatar Nov 08 '19 18:11 lhoangan

@lhoangan Thanks~ I switched to another server and it works fine now, although I still don't know why. I think you may be right~

zheng-yuwei avatar Nov 11 '19 07:11 zheng-yuwei
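
A possible lead for anyone hitting the TensorFlow variant of this error: the batch_descriptor in the log above describes a very large tensor (count 1536, 64 feature maps, 208 x 208 spatial). As far as I know, cuDNN limits a tensor descriptor to about 2 giga-elements and reports CUDNN_STATUS_NOT_SUPPORTED beyond that, so it may be worth checking the element count. The sketch below does only that arithmetic; the limit constant and the helper names are my own assumptions, not something confirmed in this thread, and this does not explain the original Caffe test failure.

```python
from math import prod

# Dimensions copied from the batch_descriptor in the TensorFlow log above.
descriptor_dims = {
    "count": 1536,             # batch size seen by cuDNN
    "feature_map_count": 64,   # channels
    "height": 208,
    "width": 208,
}

# Assumption: cuDNN rejects tensor descriptors of 2**31 elements or more.
ASSUMED_CUDNN_MAX_ELEMENTS = 2**31


def element_count(dims):
    """Total number of elements described by the dimensions (hypothetical helper)."""
    return prod(dims.values())


def exceeds_assumed_limit(dims, limit=ASSUMED_CUDNN_MAX_ELEMENTS):
    """True if the tensor would reach or exceed the assumed cuDNN size limit."""
    return element_count(dims) >= limit


if __name__ == "__main__":
    total = element_count(descriptor_dims)
    print(f"elements: {total}, assumed limit: {ASSUMED_CUDNN_MAX_ELEMENTS}")
    print("exceeds assumed limit:", exceeds_assumed_limit(descriptor_dims))
```

If the count is over the limit, as it is here (4253024256 elements), reducing the batch size that reaches a single cuDNN call would be the first thing to try.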