pix2pix-tensorflow
Applying Wasserstein (Earth Mover) distance to the project
Hi...
I was wondering if adding the Wasserstein distance to the code would help the GAN stabilize and give better results... from what I've read, it has really nice properties, and implementing it shouldn't be a problem since it only needs a couple of changes (described here). Here's the link to the GitHub page of the paper: https://github.com/martinarjovsky/WassersteinGAN
I tried implementing it in Pix2Pix, but it seems the whole network just ignores the new Wasserstein distance and only focuses on the MSE/L2 loss, thus giving averaged, useless outputs... I'm pretty sure the problem lies in my implementation (as always!).
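For reference, a minimal sketch of how I'd expect the combined generator objective to be wired (the WGAN term and the function names here are my assumptions, not the repo's actual code; master uses an L1 term with the paper's large weight of 100, and swapping in tf.square gives the L2/MSE variant):

    import tensorflow as tf

    def wgan_generator_loss(critic_fake_score, real_B, fake_B, rec_lambda=100.0):
        # WGAN generator term: raw critic score (no sigmoid, no log);
        # we maximize the score, i.e. minimize its negation.
        adv = -tf.reduce_mean(critic_fake_score)
        # Reconstruction term (L1 in master; use tf.square for an L2/MSE variant).
        rec = rec_lambda * tf.reduce_mean(tf.abs(real_B - fake_B))
        # With rec_lambda = 100, the reconstruction term can dwarf the
        # unscaled critic score, which would explain the network appearing
        # to ignore the Wasserstein term and producing averaged outputs.
        return adv + rec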
Yes, I think Wasserstein distance would help! Do you wanna make a PR?
But my code is now vastly different from the master code... I downloaded it and renamed/changed/removed nearly every part of it...
If you want, I can paste the changes (following the guide) here, as they should amount to fewer than 10 lines of code...
When I replace it with the Wasserstein distance, the results get worse... I wonder if it is because of the clipping parameter; I just set it to -0.05 and 0.05, as the WGAN paper said. My d_loss is very big, even more than 80.
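For reference, the clipping itself is just a clip-by-value assign over the critic's variables after each update. A minimal TF 1.x sketch (the 'discriminator' scope name is an assumption about how D's variables are collected):

    import tensorflow as tf

    # Clip every critic weight into [-c, c] after each D update.
    # Note the WGAN paper uses c = 0.01, not 0.05.
    c = 0.01
    d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                               scope='discriminator')
    clip_d_weights = [tf.assign(v, tf.clip_by_value(v, -c, c)) for v in d_vars]

    # In the training loop, run the clip ops right after each critic step:
    #     sess.run(d_optim, feed_dict=...)
    #     sess.run(clip_d_weights)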
Exactly! Wasserstein doesn't seem to work properly with conditional GANs... I've also seen this behavior when using the L2 loss (LSGAN).
Interesting. Could you make a PR to discuss your implementation?
I'm sorry but I'm no longer using this repository... I tested Wasserstein (and LSGAN) on it using this guide (and this one) but then moved on to the AffineLayer implementation... I no longer have the code for this repository...
But implementing it is pretty straightforward and requires just a couple of changes...
I changed the linear (fully connected) layer in the discriminator and modified the filter size from 5x5 to 4x4, and the results became much better, almost the same as the AffineLayer implementation. @Neltherion But I'd like to know why conditional LSGAN or conditional WGAN doesn't work well?
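For concreteness, a sketch of the kind of change described above: 4x4 filters throughout, with a final 4x4 convolution emitting a grid of patch scores in place of the reshape + linear layer. This is an illustration under those assumptions, not the exact modification:

    import tensorflow as tf

    def patch_discriminator(image, df_dim=64, reuse=False):
        # Guessed sketch: 4x4 conv filters everywhere, and a final 4x4 conv
        # producing one unbounded score per patch (no sigmoid, for WGAN)
        # instead of a reshape + linear layer. Not the commenter's actual code.
        with tf.variable_scope('discriminator', reuse=reuse):
            def conv(x, filters, stride):
                return tf.layers.conv2d(x, filters, kernel_size=4,
                                        strides=stride, padding='same')
            h = tf.nn.leaky_relu(conv(image, df_dim, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 2, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 4, 2))
            h = tf.nn.leaky_relu(conv(h, df_dim * 8, 1))
            # Final 4x4 conv: a grid of raw patch scores.
            return conv(h, 1, 1)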
I've been searching the web for someone else who may have implemented a conditional LSGAN or WGAN, to check whether I've made mistakes in my own implementation, but so far I haven't found one... Could you please also change your code to use the Wasserstein or L2 loss and see if it produces useful results? I've tested my outputs on colorization scenarios, and the output after 2 days of training is pure garbage!
@Neltherion did you use Adam or RMSProp? The original WGAN paper uses RMSProp with a smaller-than-usual learning rate.
I used Adam...
These are the changes from the paper needed to apply WGAN properly (a sketch putting them together follows the list):
- remove the sigmoid from D's output and the log from the loss function
- clip the weights of D to be between -0.01 and 0.01 (note: clip the weights, not the gradients)
- train D a few more times than G (the paper suggests a 5:1 ratio)
- use RMSProp, not Adam, with a suggested learning rate of 0.00005
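Putting the four changes together, a minimal TF 1.x training sketch (d_logits_real, d_logits_fake, d_vars, g_vars, num_steps, and next_batch are placeholders for this repo's tensors, variables, and data feed):

    import tensorflow as tf

    # Critic loss on raw scores: no sigmoid on D's output, no log in the loss.
    d_loss = tf.reduce_mean(d_logits_fake) - tf.reduce_mean(d_logits_real)
    g_loss = -tf.reduce_mean(d_logits_fake)

    # RMSProp with the paper's learning rate, instead of Adam.
    d_optim = tf.train.RMSPropOptimizer(5e-5).minimize(d_loss, var_list=d_vars)
    g_optim = tf.train.RMSPropOptimizer(5e-5).minimize(g_loss, var_list=g_vars)

    # Clip the critic's weights (not gradients) to [-0.01, 0.01].
    clip_d = [tf.assign(v, tf.clip_by_value(v, -0.01, 0.01)) for v in d_vars]

    n_critic = 5  # train D several steps per G step (paper suggests 5:1)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for step in range(num_steps):
            for _ in range(n_critic):
                sess.run([d_optim, clip_d], feed_dict=next_batch())
            sess.run(g_optim, feed_dict=next_batch())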
I did all of the above except for using RMSProp... If I get the time, I'll try doing it again with RMSProp...
Not using RMSProp will likely cause your discriminator to collapse at some point. The gradients appear to get very stale after a while, and Adam will likely overshoot the minima at some point, leading to massive fluctuations in the discriminator loss.
FYI: Conditional WGAN
@Fuckmi Could you tell me how you changed the linear (fully connected) layer in the discriminator?
I found something that is not quite clear. Let's take the facades data as an example, with this function:
    from scipy.misc import imread  # assumed import; the repo reads images via scipy

    def load_image(image_path):
        input_img = imread(image_path)
        w = int(input_img.shape[1])
        w2 = int(w / 2)
        img_A = input_img[:, 0:w2]  # left half of the combined image
        img_B = input_img[:, w2:w]  # right half
        return img_A, img_B
Then they are concatenated in order AB by:
    img_AB = np.concatenate((img_A, img_B), axis=2)
So the model is supposed to generate a realistic image from A to B (left to right in the combined image). However, in build_model(self), A and B are swapped relative to that order:
    self.real_B = self.real_data[:, :, :, :self.input_c_dim]
    self.real_A = self.real_data[:, :, :, self.input_c_dim:self.input_c_dim + self.output_c_dim]
My understanding is that real_B is now the input image A, so the calculation tf.reduce_mean(tf.abs(self.real_B - self.fake_B)) does not make any sense.
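To make the slicing concrete, here's a toy numpy check of the channel layout (shapes assumed, not from the repo):

    import numpy as np

    img_A = np.zeros((256, 256, 3))  # left half of the combined image
    img_B = np.ones((256, 256, 3))   # right half
    img_AB = np.concatenate((img_A, img_B), axis=2)  # shape (256, 256, 6)

    input_c_dim = 3
    # build_model's real_B slice recovers the *left* image (img_A) ...
    assert np.array_equal(img_AB[:, :, :input_c_dim], img_A)
    # ... and its real_A slice recovers the right image (img_B).
    assert np.array_equal(img_AB[:, :, input_c_dim:input_c_dim + 3], img_B)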
Best,