tensorflow-xnor-bnn

Error when both --binary --xnor are set

Open wjtan99 opened this issue 6 years ago • 3 comments

Hi, I ran a test with both --binary and --xnor set. Here are the errors:

2018-04-26 10:34:44.400066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2018-04-26 10:34:44.400072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2018-04-26 10:34:44.400080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
    status, run_metadata)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [128,512], In[1]: [512,512]
  [[Node: fc2_b/Gemm = Gemm[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](fc1_b/Sign, fc2_b/Sign)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mnist_fc_bnn.py", line 154, in <module>
    x: batch_xs, y_: batch_ys, keep_prob: args.keep_prob, phase: BN_TRAIN_PHASE})
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [128,512], In[1]: [512,512]
  [[Node: fc2_b/Gemm = Gemm[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](fc1_b/Sign, fc2_b/Sign)]]

Caused by op 'fc2_b/Gemm', defined at:
  File "mnist_fc_bnn.py", line 86, in <module>
    keep_prob, x, batch_norm, phase)
  File "../models/binary_net.py", line 21, in __init__
    self.dense_layers(batch_norm, first, last, phase)
  File "../models/binary_net.py", line 85, in dense_layers
    fc2 = tf.nn.dropout(xnor_gemm(fc1, Wb_2), self.keep_prob)
  File "<string>", line 30, in gemm
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ubuntu/anaconda2/envs/tf-xnor-bnn/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Matrix size-incompatible: In[0]: [128,512], In[1]: [512,512]
  [[Node: fc2_b/Gemm = Gemm[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](fc1_b/Sign, fc2_b/Sign)]]

Do you have an idea how to fix it?
Thanks a lot

wjtan99 • Apr 26 '18 17:04

This is a known bug at the moment: the matrices have to be square. I just haven't had time to push a fix.
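Until a fix lands, one possible stopgap is to zero-pad the non-square operand. The sketch below is only an illustration in plain Python, not the pending fix: `square_only_gemm` is a hypothetical stand-in for an op that rejects non-square inputs, and `padded_gemm` assumes the weight matrix is already square and the activation matrix has no more rows than it. The shapes are a scaled-down analogue of the [128,512] x [512,512] case in the error.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions differ"
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def square_only_gemm(a, b):
    """Hypothetical stand-in for an op that only accepts square operands."""
    n = len(a)
    if len(a[0]) != n or len(b) != n or len(b[0]) != n:
        raise ValueError("Matrix size-incompatible")
    return matmul(a, b)


def padded_gemm(a, b):
    """Zero-pad the rows of `a` up to a square, multiply, then drop the padding.

    Assumes `b` is square and `a` has at most as many rows as `b`.
    """
    n = len(b)
    rows = len(a)
    assert rows <= n, "activation matrix has more rows than the weight matrix"
    pad = [[0] * n for _ in range(n - rows)]
    return square_only_gemm(a + pad, b)[:rows]


# Scaled-down analogue of the failing shapes: [2, 4] x [4, 4].
a = [[1, -1, 1, 1], [-1, -1, 1, -1]]
b = [[1, 1, -1, -1], [1, -1, 1, -1], [-1, 1, 1, 1], [1, 1, 1, -1]]
result = padded_gemm(a, b)
```

The padded rows contribute only zero rows to the product, so slicing them off recovers the ordinary result at the cost of some wasted work. Whether zeros are acceptable input to a kernel that expects sign values is a separate question this sketch does not address.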

AngusG • Apr 26 '18 17:04

Thanks for your quick reply. Can you push the fix ASAP or email the fix to me at [email protected]?

wjtan99 • Apr 26 '18 20:04

It seems mnist_conv_bnn.py works fine with a few simple changes. I asked a question in another issue but closed it. I wanted to confirm what this repo implements. My understanding is that this is a BinaryNet with an XNOR GEMM, so it speeds things up even on a PC. Is that right?
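To make clear what I mean by XNOR GEMM, here is a minimal plain-Python sketch of the general idea as I understand it. It is not taken from this repo's kernel, and the helper names `pack` and `xnor_dot` are my own. A dot product of two +1/-1 vectors equals the number of matching signs minus the number of mismatches, which can be computed with XNOR and a popcount on packed bits.

```python
def pack(signs):
    """Pack a list of +1/-1 values into an int, one bit each (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_dot(x_bits, w_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(x_bits ^ w_bits) & mask   # bit is 1 where the two signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


x = [1, -1, -1, 1, 1, -1, 1, 1]
w = [1, 1, -1, -1, 1, -1, -1, 1]
reference = sum(a * b for a, b in zip(x, w))   # ordinary dot product
fast = xnor_dot(pack(x), pack(w), len(x))      # same value from bit operations
```

A GEMM built this way would apply the same bit trick to every row and column pair, which is where I assume the speedup comes from.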
Thanks.

wjtan99 • Apr 26 '18 21:04