SPADE-Tensorflow
Unable to run pretrained celebA hinge checkpoint
Hi, I am unable to run prediction or training using the existing celebA hinge checkpoint.
Here is the stack trace from running the random test with the pretrained checkpoint:
python main.py --dataset spade_celebA --segmap_ch 3 --phase random
Everything runs okay until the checkpoints are read. Then some shapes are reported as mismatched, and I am not able to figure out what the problem could be. I am using TensorFlow 1.14. Do I need to use an older TF version for this to work, or were some additional code modifications made after the pretrained checkpoint was saved?
UPDATE: I resolved the first shape mismatch (lhs shape= [5,5,16,128] rhs shape= [5,5,19,128]) as described below. The new mismatch is lhs shape= [32768,256] rhs shape= [8192,256].
[*] Reading checkpoints...
W1017 12:37:21.855408 139936319534848 deprecation.py:323] From /home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [5,5,16,128] rhs shape= [5,5,19,128]
  [[{{node save/Assign_623}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [5,5,16,128] rhs shape= [5,5,19,128]
  [[node save/Assign_623 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:537) ]]
Errors may have originated from an input operation. Input Source operations connected to node save/Assign_623: generator/spade_resblock_fix_2/spade_2/conv_128/conv2d/kernel/Adam (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:383)
Original stack trace for 'save/Assign_623':
File "main.py", line 125, in
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 125, in
Assign requires shapes of both tensors to match. lhs shape= [5,5,16,128] rhs shape= [5,5,19,128] [[node save/Assign_623 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:537) ]]
Errors may have originated from an input operation. Input Source operations connected to node save/Assign_623: generator/spade_resblock_fix_2/spade_2/conv_128/conv2d/kernel/Adam (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:383)
Original stack trace for 'save/Assign_623':
File "main.py", line 125, in
UPDATE: I resolved this issue by reproducing CelebAHQ masks with 19 segmentation labels (instead of 16 labels as originally defined in spade_celebA\segmap_label.txt).
Now I get the following error (with mismatched dimensions 32768 and 8192):
[*] Reading checkpoints...
W1017 13:40:56.285303 139846562793216 deprecation.py:323] From /home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [32768,256] rhs shape= [8192,256]
  [[{{node save/Assign_185}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/home/blaz/anaconda2/envs/tf1.14/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [32768,256] rhs shape= [8192,256]
  [[node save/Assign_185 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:539) ]]
Errors may have originated from an input operation. Input Source operations connected to node save/Assign_185: encoder/linear_var/kernel/Adam_1 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:385)
Original stack trace for 'save/Assign_185':
File "main.py", line 125, in
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 125, in
Assign requires shapes of both tensors to match. lhs shape= [32768,256] rhs shape= [8192,256] [[node save/Assign_185 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:539) ]]
Errors may have originated from an input operation. Input Source operations connected to node save/Assign_185: encoder/linear_var/kernel/Adam_1 (defined at /home/blaz/github/SPADE-Tensorflow/SPADE.py:385)
Original stack trace for 'save/Assign_185':
File "main.py", line 125, in
Any ideas?
Thank you!
You can change the contents of segmap_label.txt to {(0, 0, 0): 0, (0, 0, 255): 1, (255, 0, 0): 2, (150, 30, 150): 3, (255, 65, 255): 4, (150, 80, 0): 5, (170, 120, 65): 6, (125, 125, 125): 7, (255, 255, 0): 8, (0, 255, 255): 9, (255, 150, 0): 10, (255, 225, 120): 11, (255, 125, 125): 12, (200, 100, 100): 13, (0, 255, 0): 14, (0, 150, 80): 15, (215, 175, 125): 16, (220, 180, 210): 17, (125, 125, 255): 18}. The reason is that the author's source code only contains 16 classes.
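A quick way to sanity-check this mapping is to count its entries: the checkpoint kernel in the first error has shape [5,5,19,128], so the mapping should contain 19 labels. The sketch below assumes segmap_label.txt holds a Python dict literal like the one above and that the model uses one input channel per label (which the error suggests, but which is not verified against SPADE.py here). The name `label_map_text` is only illustrative.

```python
import ast

# The mapping suggested above, as it would appear in segmap_label.txt
# (assumption: the file is a plain Python dict literal).
label_map_text = (
    "{(0, 0, 0): 0, (0, 0, 255): 1, (255, 0, 0): 2, (150, 30, 150): 3, "
    "(255, 65, 255): 4, (150, 80, 0): 5, (170, 120, 65): 6, "
    "(125, 125, 125): 7, (255, 255, 0): 8, (0, 255, 255): 9, "
    "(255, 150, 0): 10, (255, 225, 120): 11, (255, 125, 125): 12, "
    "(200, 100, 100): 13, (0, 255, 0): 14, (0, 150, 80): 15, "
    "(215, 175, 125): 16, (220, 180, 210): 17, (125, 125, 255): 18}"
)

# literal_eval safely parses the dict, including its tuple keys.
label_map = ast.literal_eval(label_map_text)

num_labels = len(label_map)

# Label ids should run 0..num_labels-1 with no gaps or repeats.
ids_are_contiguous = sorted(label_map.values()) == list(range(num_labels))

print("labels:", num_labels, "contiguous ids:", ids_are_contiguous)
```

If the count printed for your own file is not 19, the first mismatch ([5,5,16,128] vs [5,5,19,128]) is the likely result.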
Hi, thank you for your reply!
I already did that, as described in my update:
UPDATE: I resolved this issue by reproducing CelebAHQ masks with 19 segmentation labels (instead of 16 labels as originally defined in spade_celebA\segmap_label.txt).
Now I get the following error (with mismatched dimensions 32768 and 8192):
[*] Reading checkpoints... W1017 13:40:56.285303 139846562793216 deprecation.py:323] From ...
This is why I am not sure what else it could be.
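One way to narrow this down without waiting for the restore to fail on one variable at a time is to list every shape stored in the checkpoint and compare it with the shapes in the locally built graph. The following is a minimal sketch, not code from the repository: the helper names are hypothetical, `tf.train.list_variables` and `tf.global_variables` are the TF 1.x calls it relies on, and the checkpoint path is whatever your run uses.

```python
def find_shape_mismatches(graph_shapes, ckpt_shapes):
    """Return {name: (graph_shape, ckpt_shape)} for variables present in
    both mappings whose shapes differ."""
    mismatches = {}
    for name, g_shape in graph_shapes.items():
        c_shape = ckpt_shapes.get(name)
        if c_shape is not None and tuple(c_shape) != tuple(g_shape):
            mismatches[name] = (tuple(g_shape), tuple(c_shape))
    return mismatches


def checkpoint_shapes(ckpt_path):
    """Read variable names and shapes from a checkpoint (no graph needed)."""
    import tensorflow as tf  # lazy import: only needed for the real lookup
    return {name: tuple(shape)
            for name, shape in tf.train.list_variables(ckpt_path)}


def current_graph_shapes():
    """Collect shapes of all global variables in the default TF 1.x graph.
    Call this after the model has been built."""
    import tensorflow as tf
    return {v.op.name: tuple(v.shape.as_list())
            for v in tf.global_variables()}


# Illustration using only the shapes reported in the second error above:
# lhs is the variable in the local graph, rhs is the value in the checkpoint.
example = find_shape_mismatches(
    {"encoder/linear_var/kernel/Adam_1": (32768, 256)},
    {"encoder/linear_var/kernel/Adam_1": (8192, 256)},
)
print(example)
```

With a real run, `find_shape_mismatches(current_graph_shapes(), checkpoint_shapes(path))` would show all differing variables at once, which makes it easier to tell whether only the encoder's linear layers are affected.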
After you change segmap_label.txt, you can run:
Random test:
python main.py --dataset spade_celebA --segmap_ch 3 --phase random
Guide test:
python main.py --dataset spade_celebA --img_ch 3 --segmap_ch 3 --phase guide --guide_img ./guide_img.png
When you train the model, however, segmap_label.txt should be created automatically. I can't reproduce this problem (the mismatched dimensions 32768 and 8192), but you can check whether the size of the dataset matches, and also the size of the images.
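To make the image-size suggestion concrete: the mismatched variable belongs to encoder/linear_var, and a [N,256] kernel means the layer receives N flattened features. The two sizes in the error differ by exactly a factor of 4, which is what a feature map with twice the height and twice the width would produce. The sketch below only does that arithmetic; the 512-channel feature map, the stride-2 downsampling and the image sizes are assumptions for illustration, not values read from SPADE.py.

```python
def flattened_size(img_size, num_downsamples, channels):
    """Number of features entering a dense layer after `num_downsamples`
    stride-2 steps on a square image with `channels` feature channels."""
    side = img_size // (2 ** num_downsamples)
    return side * side * channels


CHANNELS = 512  # assumed channel count, for illustration only

# A 4x4 feature map matches the size stored in the checkpoint (rhs).
ckpt_features = flattened_size(256, 6, CHANNELS)

# An 8x8 feature map matches the size in the local graph (lhs). Under these
# assumptions it can come from one fewer downsampling step ...
graph_features_fewer_steps = flattened_size(256, 5, CHANNELS)
# ... or from an input that is twice as large.
graph_features_larger_input = flattened_size(512, 6, CHANNELS)

print(ckpt_features, graph_features_fewer_steps, graph_features_larger_input)
```

Both explanations give the same number, so this does not identify the cause by itself; it only suggests comparing the image height and width used locally against whatever the checkpoint was trained with.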
Have you solved this? I encountered the same problem when loading the pre-trained checkpoint.