InvalidArgumentError when restoring the checkpoint

Open DubiousCactus opened this issue 5 years ago • 6 comments

When I run python demo.py --image utils/examples/plane.png, the script starts by restoring the checkpoint from utils/checkpoint/gcn.ckpt, but right after that, I get this traceback:

Traceback (most recent call last):
  File "demo.py", line 84, in <module>
    vert = sess.run(model.output3, feed_dict=feed_dict)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentErrorModel restored from file: utils/checkpoint/gcn.ckpt : flat indices[117, :] = [3, 7] does not index into param (shape: [7,7,512]).
  [[Node: graphprojection_1/GatherNd_15 = GatherNd[Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](gcn/Squeeze_3, graphprojection_1/stack_15)]]

Caused by op u'graphprojection_1/GatherNd_15', defined at:
  File "demo.py", line 36, in <module>
    model = GCN(placeholders, logging=True)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/models.py", line 108, in __init__
    self.build()
  File "build/bdist.linux-x86_64/egg/pixel2mesh/models.py", line 52, in build
    hidden = layer(self.activations[-1])
  File "build/bdist.linux-x86_64/egg/pixel2mesh/layers.py", line 74, in __call__
    outputs = self._call(inputs)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/layers.py", line 223, in _call
    out4 = self.project(self.img_feat[3], x, y, 512)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/layers.py", line 235, in project
    Q22 = tf.gather_nd(img_feat, tf.stack([tf.cast(x2,tf.int32), tf.cast(y2,tf.int32)],1))
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1338, in gather_nd
    name=name)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): flat indices[117, :] = [3, 7] does not index into param (shape: [7,7,512]).
  [[Node: graphprojection_1/GatherNd_15 = GatherNd[Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](gcn/Squeeze_3, graphprojection_1/stack_15)]]

What could be causing the error "InvalidArgumentError (see above for traceback): flat indices[117, :] = [3, 7] does not index into param (shape: [7,7,512])"?

Thanks

DubiousCactus avatar Sep 26 '18 17:09 DubiousCactus

I ran into the same error. It looks like changing the following two lines in layers.py:

x2 = tf.ceil(x)

and

y2 = tf.ceil(y)

into

x2 = tf.minimum(tf.ceil(x), tf.cast(tf.shape(img_feat)[0], tf.float32) - 1)

and

y2 = tf.minimum(tf.ceil(y), tf.cast(tf.shape(img_feat)[1], tf.float32) - 1)

will bypass the error. Not sure if this is the correct solution.
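The effect of that clamp can be checked without TensorFlow. This is a minimal plain-Python sketch, not the repository's code: clamped_ceil is a hypothetical helper that mirrors tf.minimum(tf.ceil(x), size - 1) for a single coordinate, and the sample coordinates are made up for illustration.

```python
import math

def clamped_ceil(coord, size):
    """Hypothetical helper: round a coordinate up, then cap it at the
    last valid index (size - 1) of a feature map axis."""
    return min(math.ceil(coord), size - 1)

# A 7x7 feature map has valid indices 0..6.
# A coordinate of 6.2 rounds up to 7, which is out of range;
# the clamp pulls it back to 6.
print(math.ceil(6.2), clamped_ceil(6.2, 7))

# Coordinates whose ceil is already in range are left unchanged.
print(math.ceil(3.4), clamped_ceil(3.4, 7))
```

Note that the clamp only keeps the index in range; it does not change how the coordinates themselves are scaled.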

chongyangma avatar Oct 09 '18 04:10 chongyangma

Unfortunately it didn't fix anything for me...

DubiousCactus avatar Oct 10 '18 14:10 DubiousCactus

@M4gicT0 After changing the code, did you rerun python setup.py install?

chongyangma avatar Oct 10 '18 15:10 chongyangma

@chongyangma I hadn't done that. It works now, thanks!

DubiousCactus avatar Oct 10 '18 16:10 DubiousCactus

@nywang16 Is that a proper fix? It should work out of the box.

DubiousCactus avatar Oct 10 '18 16:10 DubiousCactus

We ran into this problem recently and found that it is caused by miscalculated indices. Layer dimensions (see api.py or model.py):

  • 56x56 --> indices[0..55]
  • 28x28 --> indices[0..27]
  • 14x14 --> indices[0..13]
  • 7x7 --> indices[0..6]

Lines 248-253 of layers.py:

		h = tf.minimum(tf.maximum(h, 0), 223)
		w = tf.minimum(tf.maximum(w, 0), 223)

		x = h/(224.0/56)
		y = w/(224.0/56)
		out1 = project(self.img_feat[0], x, y, 64)

Calculation:

  • h and w can be values from 0 to 223
  • 223 / (224 / 56) = 55.75
  • tf.ceil(55.75) = 56

So we index the range 0..55 mentioned above with 56, which is one past the last valid index. This is the reason for the error.
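The same arithmetic can be repeated for all four feature map sizes. The sketch below is plain Python and assumes the other three maps use the analogous 224.0/size scaling (only the 56x56 case is quoted above); max_ceil_index is a hypothetical helper name.

```python
import math

def max_ceil_index(size, image_size=224):
    """Largest index produced by ceil() when pixel coordinates
    0..image_size-1 are divided by image_size / size."""
    max_pixel = image_size - 1          # 223 after the min/max clamp
    scale = image_size / float(size)    # e.g. 224.0 / 56 = 4.0
    return math.ceil(max_pixel / scale)

for size in (56, 28, 14, 7):
    idx = max_ceil_index(size)
    # Valid indices are 0..size-1, so idx == size is out of range.
    print(size, idx, idx > size - 1)
```

For the 7x7 map this gives ceil(223 / 32) = ceil(6.96875) = 7, which matches the out-of-range index 7 in the reported "flat indices[117, :] = [3, 7]" message.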

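Fix (divide by 223.0/55 instead of 224.0/56):

  • 223 / (223 / 55) = 55
  • tf.ceil(55.0) = 55

The same change has to be made to every x, y calculation that precedes a project() call, using the matching size for each feature map.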
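As a rough illustration of this rescaling, here is a plain-Python sketch (not the repository's code) that maps a pixel coordinate in 0..223 onto 0..size-1. The helper name to_feature_coord is hypothetical. It multiplies before dividing, which is algebraically the same as dividing by 223.0/55 but keeps the endpoint an exact integer in floating point, so ceil() cannot overshoot.

```python
import math

def to_feature_coord(pixel, size, image_size=224):
    """Map a pixel coordinate in 0..image_size-1 onto 0..size-1.

    Equivalent to pixel / ((image_size - 1) / (size - 1)), written as
    multiply-then-divide so that the largest pixel maps exactly to size - 1.
    """
    return pixel * (size - 1) / float(image_size - 1)

for size in (56, 28, 14, 7):
    top = to_feature_coord(223, size)
    # The ceil of the largest coordinate now stays within 0..size-1.
    print(size, top, math.ceil(top) <= size - 1)
```

In the TensorFlow code the same idea would apply to the tensors h and w before each project() call.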

Interestingly, this error only occurs when TensorFlow runs on the CPU. On the GPU, tf.gather_nd returns 0 for an out-of-bounds index instead of raising an error:

https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gather_nd
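To make the difference concrete, the following toy function emulates only the out-of-bounds behaviour described on that documentation page. It is plain Python on nested lists, not TensorFlow, and gather_2d and its on_gpu flag are hypothetical names used for illustration.

```python
def gather_2d(param, index, on_gpu):
    """Toy emulation of the documented out-of-bounds handling:
    error on CPU, 0 stored in the output on GPU."""
    row, col = index
    in_bounds = 0 <= row < len(param) and 0 <= col < len(param[0])
    if in_bounds:
        return param[row][col]
    if on_gpu:
        return 0.0  # GPU: a 0 is stored for the out-of-bounds index
    raise IndexError("index %r does not index into param" % (index,))

feature_map = [[1.0] * 7 for _ in range(7)]   # stand-in for a 7x7 map

print(gather_2d(feature_map, (3, 6), on_gpu=False))  # valid index
print(gather_2d(feature_map, (3, 7), on_gpu=True))   # silently 0.0
try:
    gather_2d(feature_map, (3, 7), on_gpu=False)
except IndexError as exc:
    print("CPU-style error:", exc)
```

On the GPU the out-of-range samples therefore contribute zeros without any warning, which is why the bug can go unnoticed there.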

Retraining is needed after this fix, because it changes the coordinates at which the image features are sampled.

hinnBU avatar Dec 08 '20 07:12 hinnBU