MUNIT-Tensorflow

Perceptual Loss missing

Open Cuky88 opened this issue 6 years ago • 8 comments

Thanks for your great work @taki0112

I'm curious why you didn't implement the perceptual loss, is there a special reason?

Cheers.

Cuky88 avatar Jun 18 '18 13:06 Cuky88

To keep the code simpler. The original MUNIT uses the pretrained_vgg16_lua_version.

To load this model, we need the load_lua function in PyTorch. I want to write the code using only TensorFlow, so I didn't implement the perceptual loss.

However, I will try to make it work with TensorFlow as well.

Thanks

taki0112 avatar Jun 19 '18 07:06 taki0112

Hi, thanks for the fast reply.

I saw that and I'm already working on a solution in tf only.

Cuky88 avatar Jun 19 '18 07:06 Cuky88

Would it be possible for you to open a PR?

taki0112 avatar Jun 19 '18 07:06 taki0112

Yes, I'll send you a PR when I'm finished

Cuky88 avatar Jun 19 '18 07:06 Cuky88

Thanks a lot.

taki0112 avatar Jun 19 '18 07:06 taki0112

@taki0112 Hi, I think I managed to get perceptual loss to work, but I'm not 100% sure.

Unfortunately I cannot create a PR, since I changed a couple of other things before. Please look at this commit; it contains everything for the perceptual loss. You can find the download links for the vgg16 weight files in the comment section at the top of vgg16.py. You can also look in this config file to see which values the arguments for the vgg part have. I only tested everything with the .h5 weight files.

Another open item would be to implement the LPIPS distance in TF as well. Do you plan to do that?

It would be very good if you could look over the code, I'm not an expert in TF like you :)

EDIT: this commit is also needed.

Cuky88 avatar Jun 20 '18 15:06 Cuky88

@Cuky88 Hi. I think there is a bug in the "vgg_preprocess" method in "ops.py". You subtract the means from the image, but the image range is [-1, 1], while your mean values are for the byte representation (0 to 255). As far as I understand, VGG-16 expects unnormalized input, so line #263 of ops.py:

channels[i] -= means[i]

should look like:

channels[i] = (channels[i] + 1.0) * 127.5 - means[i]
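As a rough illustration of that rescaling, here is a minimal sketch in NumPy rather than TensorFlow. It is not the code from the commit: the function below is a hypothetical stand-in for "vgg_preprocess", and the mean values and RGB channel order are assumptions (the commonly used VGG ImageNet means) that should be checked against the weight file actually loaded.

```python
import numpy as np

# ASSUMPTION: per-channel ImageNet means on the 0..255 scale, in RGB order.
# These are the commonly used VGG values; verify them (and the channel
# order) against the weights you load.
VGG_MEANS_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def vgg_preprocess(images):
    """Map images from [-1, 1] to mean-centered values on the 0..255 scale.

    images: float array of shape (N, H, W, 3), RGB, values in [-1, 1].
    Hypothetical stand-in for the function discussed in the thread.
    """
    images = np.asarray(images, dtype=np.float32)
    # Undo the [-1, 1] normalization first: -1 -> 0, 0 -> 127.5, 1 -> 255.
    images_255 = (images + 1.0) * 127.5
    # Only now subtract the means, which live on the 0..255 scale.
    # Broadcasting applies one mean per channel (last axis).
    return images_255 - VGG_MEANS_RGB


# Usage: a mid-gray batch in [-1, 1] ends up close to zero after centering.
batch = np.zeros((1, 2, 2, 3), dtype=np.float32)
out = vgg_preprocess(batch)
```

Subtracting the means directly from a [-1, 1] image would instead push every pixel to roughly -100 or below, which is far outside the input range the pretrained network saw during training.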

Correct me if I'm wrong.

MartinMeliss avatar Jul 05 '18 16:07 MartinMeliss

@MartinMeliss You are right, the image floats are scaled between -1 and 1, so the vgg preprocessing is screwing everything up. Thanks for pointing that out, I will fix it soon.

Cuky88 avatar Jul 06 '18 08:07 Cuky88