text-to-image
Error from batch_norm
I got this error when I was trying to run your scripts.
Traceback (most recent call last):
File "train.py", line 238, in <module>
main()
File "train.py", line 76, in main
input_tensors, variables, loss, outputs, checks = gan.build_model()
File "/home/akara/Workspace/text-to-image/model.py", line 44, in build_model
disc_wrong_image, disc_wrong_image_logits = self.discriminator(t_wrong_image, t_real_caption, reuse = True)
File "/home/akara/Workspace/text-to-image/model.py", line 165, in discriminator
h1 = ops.lrelu( self.d_bn1(ops.conv2d(h0, self.options['df_dim']*2, name = 'd_h1_conv'))) #16
File "/home/akara/Workspace/text-to-image/Utils/ops.py", line 34, in __call__
ema_apply_op = self.ema.apply([batch_mean, batch_var])
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/training/moving_averages.py", line 391, in apply
self._averages[var], var, decay, zero_debias=zero_debias))
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/training/moving_averages.py", line 70, in assign_moving_average
update_delta = _zero_debias(variable, value, decay)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/training/moving_averages.py", line 177, in _zero_debias
trainable=False)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
custom_getter=custom_getter)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
custom_getter=custom_getter)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 346, in get_variable
validate_shape=validate_shape)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
caching_device=caching_device, validate_shape=validate_shape)
File "/home/akara/miniconda2/envs/gan/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 650, in _get_single_variable
"VarScope?" % name)
ValueError: Variable d_bn1/d_bn1_2/d_bn1_2/moments/moments_1/mean/ExponentialMovingAverage/biased does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
It happens when the script tries to create the second discriminator:
disc_real_image, disc_real_image_logits = self.discriminator(t_real_image, t_real_caption)
disc_wrong_image, disc_wrong_image_logits = self.discriminator(t_wrong_image, t_real_caption, reuse = True) # Here
disc_fake_image, disc_fake_image_logits = self.discriminator(fake_image, t_real_caption, reuse = True)
I printed all the variables, and the second call seems to initialize variables under different names even though reuse = True is set.
same problem
Is there any solution for this issue? @csbkwang @akaraspt @paarthneekhara
What tensorflow version are you using? IIRC the code ran on version r0.10. I don't have access to a machine to debug the code right now.
I used tensorflow 1.0. Thanks Paarth @paarthneekhara
@jiang2764 So did it work?
I got the same error when I tried to run the training code. That's why I asked you and the others. Thanks. @paarthneekhara
Hi, this is a compatibility issue caused by the TensorFlow update. Replace the batch_norm class in ops.py with the one from https://github.com/iamaaditya/DCGAN-tensorflow/blob/master/ops.py. This should fix the issue.
I replaced the batch_norm class in ops.py as suggested. However, another problem remains: Variable d_h0_conv/w/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope? How can I solve this? Thanks! @paarthneekhara
At last I solved the problem! Two changes were needed. First, use the new ops.py. Second, add with tf.variable_scope(tf.get_variable_scope()) to the code. Thanks everyone!
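A minimal sketch of where such a wrapper could go, not confirmed by the poster above. It assumes the discriminator calls tf.get_variable_scope().reuse_variables() when reuse = True, which would leave the root scope in reuse mode so that Adam can no longer create its d_h0_conv/w/Adam slot variables. toy_discriminator and d_w are hypothetical stand-ins for the real discriminator in model.py, and the import fallback is only there so the snippet also runs on newer TensorFlow installs.

```python
# Use the TF1-style graph API; fall back to plain `tensorflow` on old 1.x installs.
try:
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
except (ImportError, AttributeError):
    import tensorflow as tf


def toy_discriminator(x, reuse=False):
    # Hypothetical stand-in for the discriminator method in model.py.
    if reuse:
        # Assumption: the real code switches the *current* scope to reuse mode like this.
        tf.get_variable_scope().reuse_variables()
    w = tf.get_variable("d_w", [4, 1],
                        initializer=tf.random_normal_initializer(stddev=0.02))
    return tf.matmul(x, w)


graph = tf.Graph()
with graph.as_default():
    real = tf.placeholder(tf.float32, [None, 4])
    wrong = tf.placeholder(tf.float32, [None, 4])

    # The block works on a copy of the current scope, so the reuse flag
    # set inside it is dropped again when the block exits.
    with tf.variable_scope(tf.get_variable_scope()):
        d_real = toy_discriminator(real)
        d_wrong = toy_discriminator(wrong, reuse=True)

    reuse_after_block = tf.get_variable_scope().reuse

    loss = tf.reduce_mean(tf.square(d_real)) + tf.reduce_mean(tf.square(d_wrong))
    # The optimizer is built outside the block, where new variables may be created.
    train_op = tf.train.AdamOptimizer(0.0002).minimize(loss)
    trainable_names = [v.name for v in tf.trainable_variables()]
```

In the actual repository this would presumably mean wrapping the discriminator calls in build_model() (or the whole model construction) in the with block and leaving the optimizer creation in train.py outside of it.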
I also got stuck on this problem and solved it in another way. My TensorFlow version is '0.12.1'. I replaced the batch_norm class in ops.py with the code from https://github.com/Hanock/generating_images_part_by_part/blob/master/code/lib/ops.py. I modified the init function (removed the parameter "batch_size") and it finally works.
@Duke-Wyh thanks, but where to put tf.variable_scope(tf.get_variable_scope())?
@jiang2764 Did you solve this problem? I have the same problem. I used tensorflow 1.0.1.
@zhhezhhe Please follow @paarthneekhara's suggestion: update the ops file, and then modify the argument format of the calls to tf.nn.sigmoid_cross_entropy_with_logits. The training process should then work. Thanks @paarthneekhara! I am running the training process now. I stopped working on this after I asked the question. Now it is time to go for this.
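For context on the argument format: TensorFlow 1.0 only accepts named arguments for this function, i.e. tf.nn.sigmoid_cross_entropy_with_logits(labels=..., logits=...), whereas older code passed the logits first, positionally. The sketch below is a pure-Python reference for a single value pair, not the TensorFlow implementation, and is only meant to show that the loss is not symmetric in its two inputs, so they must not be mixed up.

```python
import math


def sigmoid_cross_entropy_with_logits(labels, logits):
    """Reference loss for one (label, logit) pair.

    Uses the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)),
    where x is the logit and z is the label.
    """
    x, z = logits, labels
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))


# A confident prediction for a "real" label (1.0) gives a small positive loss.
loss_named = sigmoid_cross_entropy_with_logits(labels=1.0, logits=3.0)

# Feeding the same two numbers the other way round gives a different value,
# which is why the arguments have to be passed by name.
loss_swapped = sigmoid_cross_entropy_with_logits(labels=3.0, logits=1.0)
```

In model.py, the corresponding change would presumably be to add labels= and logits= to every loss call, with the discriminator logits as logits and the ones/zeros targets as labels.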
@OwalnutO , @jiang2764 if the method worked for you, can you please submit a pull request with the patch for the same?
This may help: https://github.com/YearnyeenHo/text-to-image
where to put tf.variable_scope(tf.get_variable_scope())? @Duke-Wyh
Using https://github.com/YearnyeenHo/text-to-image, I still have this problem in TensorFlow 1.3: Variable d_h0_conv/w/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope? How can I solve it? Thank you @zhhezhhe
Hi @SpadesQ , were you able to find a solution to this? I am facing the same issue.
After replacing the ops file, the problem with Adam comes up during training. If I try to use the checkpoint instead, I get: NotFoundError (see above for traceback): Tensor name "d_bn1/moving_mean" not found in checkpoint files Data/Models/latest_model_flowers_temp.ckpt
@paarthneekhara
When I try to generate images using the pre-trained model, I also get the following error.
NotFoundError (see above for traceback): Tensor name "d_bn1/moving_mean" not found in checkpoint files Data/Models/latest_model_flowers_temp.ckpt
I am using the code from https://github.com/YearnyeenHo/text-to-image and have downloaded the checkpoint file from the link given.
Please suggest a solution.
@paarthneekhara Thanks for writing this code. I have the same problem as above. I'm running the latest release of each lib needed, but this one stumped me. Is there a good solution that makes this work? All the dialog above is a bit hodgepodge. I'd like to see your solution please.