pose-tensorflow

cuDNN launch failure

Open. Tianyu97 opened this issue 5 years ago • 1 comment

When I run the single-person demo, it fails with the error below. My CUDA version is 9.0.176 and my cuDNN version is 7.4.2.24.

(tensorflow) bfs@zty2:~/pose-tensorflow-master$ TF_CUDNN_USE_AUTOTUNE=0 python3 demo/singleperson.py
2019-04-14 12:04:06.515875: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-14 12:04:08.638915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6325
pciBusID: 0000:04:00.0
totalMemory: 10.92GiB freeMemory: 10.32GiB
2019-04-14 12:04:08.638992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:04:00.0, compute capability: 6.1)
2019-04-14 12:04:14.960564: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7402 (compatibility version 7400) but source was compiled with 7004 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2019-04-14 12:04:14.962004: W ./tensorflow/stream_executor/stream.h:1988] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_call
    return fn(*args)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
    status, run_metadata)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
  [[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo/singleperson.py", line 26, in <module>
    outputs_np = sess.run(outputs, feed_dict={inputs: image_batch})
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
  [[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]

Caused by op 'resnet_v1_101/conv1/Conv2D', defined at:
  File "demo/singleperson.py", line 17, in <module>
    sess, inputs, outputs = predict.setup_pose_prediction(cfg)
  File "demo/../nnet/predict.py", line 11, in setup_pose_prediction
    outputs = pose_net(cfg).test(inputs)
  File "demo/../nnet/pose_net.py", line 89, in test
    heads = self.get_net(inputs)
  File "demo/../nnet/pose_net.py", line 85, in get_net
    net, end_points = self.extract_features(inputs)
  File "demo/../nnet/pose_net.py", line 55, in extract_features
    net, end_points = net_fun(im_centered, global_pool=False, output_stride=16, is_training=False)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 300, in resnet_v1_101
    scope=scope)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v1.py", line 205, in resnet_v1
    net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py", line 146, in conv2d_same
    scope=scope)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1057, in convolution
    outputs = layer.apply(inputs)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 762, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 652, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/layers/convolutional.py", line 167, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 838, in __call__
    return self.conv_op(inp, filter)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 502, in __call__
    return self.call(inp, filter)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 190, in __call__
    name=self.name)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/home/bfs/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InternalError (see above for traceback): cuDNN launch failure : input shape([1,3,518,280]) filter shape([7,7,3,64])
  [[Node: resnet_v1_101/conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](resnet_v1_101/Pad, resnet_v1_101/conv1/weights/read)]]
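The line in the log that matters most is probably the `cuda_dnn.cc` error: the installed cuDNN reports version 7402, while the TensorFlow binary was compiled against 7004. cuDNN encodes its version as `major * 1000 + minor * 100 + patch`, so these are 7.4.2 and 7.0.4. The sketch below decodes the two numbers from that message. The helper names are made up for illustration, and the compatibility rule (compare the versions rounded down to the minor release) is inferred from the "compatibility version" values printed in the log, not taken from the TensorFlow source.

```python
import re

# Matches the mismatch message printed by cuda_dnn.cc in the log above.
MISMATCH_RE = re.compile(
    r"Loaded runtime CuDNN library: (\d+) .*? compiled with (\d+)",
    re.DOTALL,
)


def decode_cudnn_version(code):
    """Split cuDNN's integer version (major*1000 + minor*100 + patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code):
    """Round down to the minor release, e.g. 7402 -> 7400 (assumed rule)."""
    return code - code % 100


def explain_mismatch(log_text):
    """Return the decoded versions from a log, or None if no message is found."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    loaded, built = (int(group) for group in match.groups())
    return {
        "loaded": decode_cudnn_version(loaded),
        "built": decode_cudnn_version(built),
        "compatible": compatibility_version(loaded) == compatibility_version(built),
    }


sample = (
    "Loaded runtime CuDNN library: 7402 (compatibility version 7400) "
    "but source was compiled with 7004 (compatibility version 7000)."
)
result = explain_mismatch(sample)
print(result)
```

For this log the check comes out incompatible, which suggests that either installing a cuDNN 7.0.x release or using a TensorFlow build compiled against cuDNN 7.4 would be the thing to try.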

Apr 14 '19 12:04 Tianyu97

Could you please tell me which versions of CUDA, cuDNN, and TensorFlow you are using?

Apr 14 '19 12:04 Tianyu97
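One way to collect the versions asked for above is sketched here. It assumes a cuDNN 7.x install whose `cudnn.h` header defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL`; the location of that header varies by system and is not given in this thread, so the example parses a stand-in string instead of a real file. The function names are hypothetical.

```python
import re


def parse_cudnn_header(header_text):
    """Read CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL from cudnn.h text."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+%s\s+(\d+)" % name, header_text)
        if match is None:
            return None  # not a cuDNN header, or a layout this sketch does not handle
        parts.append(int(match.group(1)))
    return tuple(parts)


def tensorflow_version():
    """Return tf.__version__, or None when TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


# Stand-in for the relevant lines of a cuDNN 7.4.2 header (illustrative text).
sample_header = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
"""
cudnn = parse_cudnn_header(sample_header)
print("cuDNN:", ".".join(str(part) for part in cudnn))
```

In practice you would pass the contents of the real header to `parse_cudnn_header`, call `tensorflow_version()` inside the same environment that runs the demo, and read the CUDA toolkit version from `nvcc --version`.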