pix2pix-tensorflow
Applying Wasserstein (Earth Mover) distance to the project
Hi...
I was wondering if adding the Wasserstein distance to the code would help the GAN stabilize and give better results... from what I've read, it has really nice properties, and implementing it shouldn't be a problem since it only needs a couple of changes (described here). Here's the link to the GitHub page of the paper: https://github.com/martinarjovsky/WassersteinGAN
I tried implementing it in Pix2Pix, but it seems the whole network just ignores the new Wasserstein distance and only focuses on the MSE/L2 loss, thus giving averaged, useless outputs... I'm pretty sure the problem lies in my implementation (as always!).
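For reference, a minimal sketch of how I'd expect the combined generator objective to be wired (the WGAN term and the function names here are my assumptions, not the repo's actual code; master uses an L1 term with the paper's large weight of 100, and swapping in tf.square gives the L2/MSE variant):

    import tensorflow as tf

    def wgan_generator_loss(critic_fake_score, real_B, fake_B, rec_lambda=100.0):
        # WGAN generator term: raw critic score (no sigmoid, no log);
        # we maximize the score, i.e. minimize its negation.
        adv = -tf.reduce_mean(critic_fake_score)
        # Reconstruction term (L1 in master; use tf.square for an L2/MSE variant).
        rec = rec_lambda * tf.reduce_mean(tf.abs(real_B - fake_B))
        # With rec_lambda = 100, the reconstruction term can dwarf the
        # unscaled critic score, which would explain the network appearing
        # to ignore the Wasserstein term and producing averaged outputs.
        return adv + rec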
Yes, I think Wasserstein distance would help! Do you wanna make a PR?
But my code is now vastly different from the master code... I downloaded it and renamed/changed/removed nearly every part of it...
If you want, I can paste the changes (following the guide) here, as they should amount to fewer than 10 lines of code...
When I replace it with the Wasserstein distance, the results get worse... I wonder if it is because of the clipping parameter; I just set it to -0.05 and 0.05, as the WGAN paper said. My d_loss is very big, even more than 80.
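For reference, the clipping itself is just a clip-by-value assign over the critic's variables after each update. A minimal TF 1.x sketch (the 'discriminator' scope name is an assumption about how D's variables are collected):

    import tensorflow as tf

    # Clip every critic weight into [-c, c] after each D update.
    # Note the WGAN paper uses c = 0.01, not 0.05.
    c = 0.01
    d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                               scope='discriminator')
    clip_d_weights = [tf.assign(v, tf.clip_by_value(v, -c, c)) for v in d_vars]

    # In the training loop, run the clip ops right after each critic step:
    #     sess.run(d_optim, feed_dict=...)
    #     sess.run(clip_d_weights)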
Exactly! Wasserstein doesn't seem to work properly with conditional GANs... I've also seen this behavior when using the L2 loss (LSGAN).
Interesting. Could you make a PR to discuss your implementation?
I'm sorry but I'm no longer using this repository... I tested Wasserstein (and LSGAN) on it using this guide (and this one) but then moved on to the AffineLayer implementation... I no longer have the code for this repository...
But implementing it is pretty straightforward and requires just a couple of changes...
I changed the linear (fully connected) layer in the discriminator and modified the filter size from 5x5 to 4x4, and the results became much better, almost the same as the AffineLayer implementation. @Neltherion But I'd like to know why conditional LSGAN or conditional WGAN doesn't work well?
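For concreteness, a sketch of the kind of change described above: 4x4 filters throughout, with a final 4x4 convolution emitting a grid of patch scores in place of the reshape + linear layer. This is an illustration under those assumptions, not the exact modification:

    import tensorflow as tf

    def patch_discriminator(image, df_dim=64, reuse=False):
        # Guessed sketch: 4x4 conv filters everywhere, and a final 4x4 conv
        # producing one unbounded score per patch (no sigmoid, for WGAN)
        # instead of a reshape + linear layer. Not the commenter's actual code.
        with tf.variable_scope('discriminator', reuse=reuse):
            def conv(x, filters, stride):
                return tf.layers.conv2d(x, filters, kernel_size=4,
                                        strides=stride, padding='same')
            h = tf.nn.leaky_relu(conv(image, df_dim, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 2, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 4, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 8, 1))
            # Final 4x4 conv: a grid of raw patch scores.
            return conv(h, 1, 1)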
I've been searching the web for someone else who may have implemented a conditional LSGAN or WGAN, to check whether I've made mistakes in my own implementation, but so far I haven't found one... Could you please also change your code to use the Wasserstein or L2 loss and see if it produces useful results? I've tested my outputs on colorization scenarios, and the output after 2 days of training is pure garbage!
@Neltherion did you use Adam or RMSProp? The original WGAN paper uses RMSProp with a smaller-than-usual learning rate.
I used Adam...
These are the changes from the paper needed to apply WGAN properly (a sketch putting them together follows the list):
- remove the sigmoid from D's output and the log from the loss function
- clip the weights of D to be between -0.01 and 0.01 (note: clip the weights, not the gradients)
- train D a few more times than G (the paper suggests a 5:1 ratio)
- use RMSProp, not Adam, with a suggested learning rate of 0.00005
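Putting the four changes together, a minimal TF 1.x training sketch (d_logits_real, d_logits_fake, d_vars, g_vars, num_steps, and next_batch are placeholders for this repo's tensors, variables, and data feed):

    import tensorflow as tf

    # Critic loss on raw scores: no sigmoid on D's output, no log in the loss.
    d_loss = tf.reduce_mean(d_logits_fake) - tf.reduce_mean(d_logits_real)
    g_loss = -tf.reduce_mean(d_logits_fake)

    # RMSProp with the paper's learning rate, instead of Adam.
    d_optim = tf.train.RMSPropOptimizer(5e-5).minimize(d_loss, var_list=d_vars)
    g_optim = tf.train.RMSPropOptimizer(5e-5).minimize(g_loss, var_list=g_vars)

    # Clip the critic's weights (not gradients) to [-0.01, 0.01].
    clip_d = [tf.assign(v, tf.clip_by_value(v, -0.01, 0.01)) for v in d_vars]

    n_critic = 5  # train D several steps per G step (paper suggests 5:1)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for step in range(num_steps):
            for _ in range(n_critic):
                sess.run([d_optim, clip_d], feed_dict=next_batch())
            sess.run(g_optim, feed_dict=next_batch())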
I did all of the above except for using RMSProp... If I get the time, I'll try doing it again with RMSProp...
Not using RMSProp will likely cause your discriminator to collapse at some point. The gradients appear to get very stale after a while, and Adam will likely overshoot the minima at some point, leading to massive fluctuations in the discriminator loss.
FYI: Conditional WGAN
@Fuckmi Could you tell me how you changed the linear (fully connected) layer in the discriminator?
I found something that is not quite clear. Let's take the facades data as an example, with this function:
    from scipy.misc import imread  # assumed import; the repo reads images via scipy

    def load_image(image_path):
        input_img = imread(image_path)
        w = int(input_img.shape[1])
        w2 = int(w / 2)
        img_A = input_img[:, 0:w2]  # left half of the combined image
        img_B = input_img[:, w2:w]  # right half
        return img_A, img_B
Then they are concatenated in order AB by:
    img_AB = np.concatenate((img_A, img_B), axis=2)
So the model is supposed to generate a realistic image from A to B (left to right in the combined image). However, in build_model(self), A and B are swapped relative to that order:
    self.real_B = self.real_data[:, :, :, :self.input_c_dim]
    self.real_A = self.real_data[:, :, :, self.input_c_dim:self.input_c_dim + self.output_c_dim]
My understanding is that real_B is now the input image A, so the calculation tf.reduce_mean(tf.abs(self.real_B - self.fake_B)) does not make any sense.
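To make the slicing concrete, here's a toy numpy check of the channel layout (shapes assumed, not from the repo):

    import numpy as np

    img_A = np.zeros((256, 256, 3))  # left half of the combined image
    img_B = np.ones((256, 256, 3))   # right half
    img_AB = np.concatenate((img_A, img_B), axis=2)  # shape (256, 256, 6)

    input_c_dim = 3
    # build_model's real_B slice recovers the *left* image (img_A) ...
    assert np.array_equal(img_AB[:, :, :input_c_dim], img_A)
    # ... and its real_A slice recovers the right image (img_B).
    assert np.array_equal(img_AB[:, :, input_c_dim:input_c_dim + 3], img_B)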
Best,