FSA-Net icon indicating copy to clipboard operation
FSA-Net copied to clipboard

Error when using latest Keras version

Open alphinside opened this issue 6 years ago • 4 comments

Hi, when using latest keras version, I encountered some error like this when try to run the demo

`Using TensorFlow backend. /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint8 = np.dtype([("qint8", np.int8, 1)]) /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint16 = np.dtype([("quint16", np.uint16, 1)]) /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. np_resource = np.dtype([("resource", np.ubyte, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint8 = np.dtype([("qint8", np.int8, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint16 = np.dtype([("quint16", np.uint16, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) /usr/local/lib/python3.6/dist-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. np_resource = np.dtype([("resource", np.ubyte, 1)]) WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2019-08-30 07:16:49.590373: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA 2019-08-30 07:16:49.595010: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1 2019-08-30 07:16:49.673540: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.673849: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x53e1310 executing computations on platform CUDA. Devices: 2019-08-30 07:16:49.673870: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1060 with Max-Q Design, Compute Capability 6.1 2019-08-30 07:16:49.693333: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2208000000 Hz 2019-08-30 07:16:49.695634: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x542afe0 executing computations on platform Host. Devices: 2019-08-30 07:16:49.695701: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): , 2019-08-30 07:16:49.696117: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.697607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: name: GeForce GTX 1060 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.48 pciBusID: 0000:01:00.0 2019-08-30 07:16:49.698281: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0 2019-08-30 07:16:49.701125: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0 2019-08-30 07:16:49.701867: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0 2019-08-30 07:16:49.702089: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0 2019-08-30 07:16:49.703098: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0 2019-08-30 07:16:49.703873: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0 2019-08-30 07:16:49.706541: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7 2019-08-30 07:16:49.706656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.707002: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.707215: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0 2019-08-30 07:16:49.707257: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0 2019-08-30 07:16:49.707789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix: 2019-08-30 07:16:49.707800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0 2019-08-30 07:16:49.707808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N 2019-08-30 07:16:49.707890: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.708258: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2019-08-30 07:16:49.709155: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5097 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1) WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2041: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2041: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4271: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4271: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4267: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4267: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /root/FSA-Net/demo/capsulelayers.py:155: calling softmax (from tensorflow.python.ops.nn_ops) with dim is deprecated and will be removed in a future version. Instructions for updating: dim is deprecated, use axis instead WARNING:tensorflow:From /root/FSA-Net/demo/capsulelayers.py:155: calling softmax (from tensorflow.python.ops.nn_ops) with dim is deprecated and will be removed in a future version. Instructions for updating: dim is deprecated, use axis instead Traceback (most recent call last): File "demo_FSANET.py", line 197, in main() File "demo_FSANET.py", line 122, in main stage_num, lambda_d, S_set)() File "/root/FSA-Net/demo/FSANET_model.py", line 844, in call ssr_Cap_model = ssr_Cap_model_build((self.num_primcaps,64)) File "/root/FSA-Net/demo/FSANET_model.py", line 823, in ssr_Cap_model_build capsule = CapsuleLayer(self.num_capsule, self.dim_capsule, self.routings, name='caps')(input_primcaps) File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 451, in call output = self.call(inputs, **kwargs) File "/root/FSA-Net/demo/capsulelayers.py", line 170, in call b += K.batch_dot(outputs, inputs_hat, [2, 3]) File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 1261, in batch_dot 'y.shape[%d] (%d != %d).' % (axes[0], axes[1], d1, d2)) ValueError: Can not do batch_dot on inputs with shapes (None, 3, 3, 21, 16) and (None, 3, 21, 21, 16) with axes=[2, 3]. x.shape[2] != y.shape[3] (3 != 21).`

alphinside avatar Aug 30 '19 07:08 alphinside

Hello. I will try to fix later. In the mean time, please use keras 2.2.4 and tensorflow 1.10.0. Please check the SSD demo which is the current best demo version. Thanks

shamangary avatar Aug 30 '19 07:08 shamangary

@shamangary
I fix above error by reinstall keras 2.2.4, but then i encountered following error

019-09-02 17:30:36.424764: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0 2019-09-02 17:30:36.614356: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7 2019-09-02 17:30:37.568972: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.584388: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.595557: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.597083: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.598581: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.599950: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.601378: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.602724: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.604112: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.605472: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.606859: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR 2019-09-02 17:30:37.608223: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR Traceback (most recent call last): File "demo_FSANET_ssd.py", line 216, in main() File "demo_FSANET_ssd.py", line 203, in main input_img = draw_results_ssd(detected,input_img,faces,ad,img_size,img_w,img_h,model,time_detection,time_network,time_plot) File "demo_FSANET_ssd.py", line 79, in draw_results_ssd p_result = model.predict(face) File "/home/haha/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1169, in predict steps=steps) File "/home/haha/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 294, in predict_loop batch_outs = f(ins_batch) File "/home/haha/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in call return self._call(inputs) File "/home/haha/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call fetched = self._callable_fn(*array_vals) File "/home/haha/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in call run_metadata_ptr) tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found. (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [[{{node model_3/ssr_G_model/separable_conv2d_29/separable_conv2d}}]] [[average_1/truediv/_3413]] (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [[{{node model_3/ssr_G_model/separable_conv2d_29/separable_conv2d}}]] 0 successful operations. 0 derived errors ignored.

any idea about this? thanks

HansonSun avatar Sep 02 '19 10:09 HansonSun

Please set the environment into the suggested ones to avoid the problem.

python                    3.5.6                hc3d631a_0  
keras-applications        1.0.4                    py35_1    anaconda
keras-base                2.1.0                    py35_0    anaconda
keras-gpu                 2.1.0                         0    anaconda
keras-preprocessing       1.0.2                    py35_1    anaconda
tensorflow                1.10.0          mkl_py35heddcb22_0  
tensorflow-base           1.10.0          mkl_py35h3c3e929_0  
tensorflow-gpu            1.10.0               hf154084_0    anaconda
cudnn                     7.1.3                 cuda8.0_0  
cuda80                    1.0                           0    soumith
numpy                     1.15.2          py35_blas_openblashd3ea46f_0  [blas_openblas]  conda-forge
numpy-base                1.14.3           py35h2b20989_0  

shamangary avatar Sep 03 '19 03:09 shamangary

In case of CUDNN_STATUS_INTERNAL_ERROR, please check again the GPU Memory is avaliable or not by: nvidia-smi

If not, then you could insert these lines of code to set the gpu memory growth correctly:

from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))

leviethung2103 avatar Nov 04 '19 10:11 leviethung2103