deep-photo-enhancer

Solved some issues in the implementation

Open · MrRobot2211 opened this issue 4 years ago · 25 comments

Hi, I have solved some issues with the implementation and got it working properly with the 2-way GAN. Take a look and let's see if we can merge efforts: https://github.com/MrRobot2211/pytorch-deep-photo-enhancer

MrRobot2211 avatar Apr 21 '20 18:04 MrRobot2211

Thanks a lot! Could you please show the results and losses? For now, I have trained it for 155 epochs, but the result is awful: the D loss is about -400,000,000 and the G loss is about -150,000,000.
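For reference, in a WGAN-style setup the critic loss is unbounded below, so large negative values by themselves do not prove divergence. A minimal sketch of the usual loss signs, assuming a WGAN formulation and not necessarily this repo's exact code:

import torch

# Wasserstein losses: the critic maximizes D(real) - D(fake); written as a
# loss to minimize, it is unbounded below, so it can drift far negative
# even when training is healthy.
def d_loss(critic, real, fake):
    return critic(fake).mean() - critic(real).mean()

def g_loss(critic, fake):
    return -critic(fake).mean()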

mtics avatar Apr 22 '20 00:04 mtics

Hi, I am testing it on a reduced set. I am getting:

[Epoch 300/300] [Batch 2/2] [D loss: -8273.702557] [G loss: 1007.653687]

Loss loss: 0.003262 PSNR Avg: 24.864868
Loss loss: 0.003182 PSNR Avg: 24.919203
Final PSNR Avg: 24.919203

And the results are as expected. I think the PSNR is more informative than the actual losses. In any case, I am training on a reduced set. Can you upload your train/test data (as they are in the directories) somewhere, so I can download it and verify that we are actually doing the same thing?
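A minimal sketch of how such a PSNR number can be computed for image tensors scaled to [0, 1]; this is an illustrative helper, not necessarily the repo's implementation:

import torch

def psnr(pred, target, max_val=1.0):
    # peak signal-to-noise ratio in dB; higher is better
    mse = torch.mean((pred - target) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)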

MrRobot2211 avatar Apr 22 '20 03:04 MrRobot2211

When learning_rate was 1e-3, I got this: [Epoch 229/300] [Batch 9/29] [D loss: -560770361.040283] [G loss: -275414976.000000]

So I changed it to 1e-5 and then got this: [Epoch 300/300] [Batch 29/29] [D loss: -173917.736206] [G loss: -56673.386719]

PSNR Avg: 12.403313
PSNR Avg: 11.698874
PSNR Avg: 11.964977
PSNR Avg: 11.624313
PSNR Avg: 11.389729
PSNR Avg: 11.009916
PSNR Avg: 11.019134
Final PSNR Avg: 11.019134

I uploaded them to Dropbox. The upload contains train_checkpoints, train_images, constant.py, and the dataset I used (in mini.tar).

mtics avatar Apr 22 '20 04:04 mtics

Great, thank you. I'll try to run it as soon as possible so that we can compare. In any case, your code has some differences from the original, which may be related to the bad performance; the main one is that they use InstanceNorm instead of BatchNorm.
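In PyTorch the swap is a one-line change per layer; a sketch with placeholder channel sizes, not the repo's actual architecture:

import torch.nn as nn

# BatchNorm2d normalizes statistics over the whole batch; InstanceNorm2d
# normalizes each sample independently, which matters for style/enhancement
# tasks. Channel sizes below are placeholders.
block_bn = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.BatchNorm2d(32), nn.LeakyReLU(0.2))
block_in = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.InstanceNorm2d(32), nn.LeakyReLU(0.2))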

MrRobot2211 avatar Apr 22 '20 04:04 MrRobot2211

I used your code to train; maybe there is something wrong with my dataset or hyperparameters?

When lr = 1e-3, the result is really awful...

mtics avatar Apr 22 '20 04:04 mtics

Yes, you should set lr to 1e-5. At 1e-3 the losses fall really fast but the PSNR does not improve. Try pulling it now; I just made another commit.
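For example, assuming Adam optimizers; the models here are placeholders, not the repo's networks:

import torch
import torch.nn as nn

generator = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))      # placeholder model
discriminator = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1))  # placeholder model

# 1e-5 is the rate that behaved for us; at 1e-3 the losses collapsed
# without the PSNR improving
optimizer_G = torch.optim.Adam(generator.parameters(), lr=1e-5)
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=1e-5)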

MrRobot2211 avatar Apr 22 '20 04:04 MrRobot2211

I used my mini dataset and have trained for 60 epochs, but the D loss is -20,000 and the G loss is -3,000.

Is there something wrong with my dataset?

mtics avatar Apr 22 '20 04:04 mtics

Have you set lr to 1e-5? Run it for 300 epochs and then check the quality of the results and the PSNR.

MrRobot2211 avatar Apr 22 '20 04:04 MrRobot2211

Yes, I set it. Training on my dataset may take a long time; for now, it has only trained 105 epochs. I will wait until it is finished.

mtics avatar Apr 22 '20 05:04 mtics

We may have to revise the batch size afterwards. I am also running it.

MrRobot2211 avatar Apr 22 '20 05:04 MrRobot2211

Could you upload your dataset? I want to compare them to see if there is anything wrong with mine.

mtics avatar Apr 22 '20 05:04 mtics

Yes, when this run ends I'll upload it. In any case, with your dataset and a batch size of 6 I am getting this around epoch 22:

Done training discriminator on iteration: 4
[Epoch 24/300] [Batch 6/14] [D loss: -2097.180758] [G loss: 4727.187500]
Done training discriminator on iteration: 5

So I would rather think it was the batch size, which in some way affects the D/G ratio.
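The "iteration" lines come from running several discriminator updates per generator update; a rough sketch of that schedule, with an assumed ratio and hypothetical helper functions:

n_critic = 5  # assumed ratio of discriminator to generator steps

for step, (low, high) in enumerate(dataloader):   # dataloader is assumed
    d_loss = train_discriminator_step(low, high)  # hypothetical helper: one critic update
    if step % n_critic == n_critic - 1:
        g_loss = train_generator_step(low)        # hypothetical helper: one generator update

A smaller batch size changes how many of these steps happen per epoch, which shifts the effective D/G balance.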

MrRobot2211 avatar Apr 22 '20 05:04 MrRobot2211

Another possible issue is the pretrain... If you are using my pretrained model, it probably differs a lot from yours because of the dataset difference.

MrRobot2211 avatar Apr 22 '20 05:04 MrRobot2211

I used my own pretrained models before, and the PSNR was always about 11.

I finished this training run and got this: [Epoch 300/300] [Batch 29/29] [D loss: -152075.056702] [G loss: -52065.578125]

PSNR Avg: 10.446398
PSNR Avg: 10.814289
PSNR Avg: 10.811638
PSNR Avg: 10.657595
PSNR Avg: 10.770314
PSNR Avg: 10.719786
PSNR Avg: 11.297231
Final PSNR Avg: 11.297231

It seems to be worse than before...

mtics avatar Apr 22 '20 06:04 mtics

Hi, let's try debugging. I have trained my 1-way GAN on your dataset and I am getting good output results. Can you try it? I suspect we are doing too few generator passes in the 2-way GAN.
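For context, a rough sketch of what one 2-way (cycle-style) pass involves; G_xy and G_yx are hypothetical names for the two direction generators, not the repo's identifiers:

def two_way_forward(G_xy, G_yx, x, y):
    # enhance in one direction, degrade in the other, then cycle back so a
    # consistency loss can compare rec_x/rec_y with the originals
    fake_y = G_xy(x)
    fake_x = G_yx(y)
    rec_x = G_yx(fake_y)
    rec_y = G_xy(fake_x)
    return fake_y, fake_x, rec_x, rec_y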

MrRobot2211 avatar Apr 22 '20 12:04 MrRobot2211

OK, I am trying it, but it may take a long time to get results.

mtics avatar Apr 22 '20 12:04 mtics

I tried 1WayGan_train on my mini dataset, but the result is not so good.

[Epoch 300/300] [Batch 28/29] [D loss: 150.137268] [G loss: 28.260000]
[Epoch 300/300] [Batch 29/29] [D loss: 195.740005] [G loss: 28.260000]

Final PSNR Avg: 10.712687

I really don't know what makes the difference. I didn't even use multiple GPUs.

mtics avatar Apr 22 '20 23:04 mtics

Sorry, I meant the 1-way pretrain. I have not changed anything in the 1-way GAN training.

MrRobot2211 avatar Apr 22 '20 23:04 MrRobot2211

Oh, I also used it to pretrain on my dataset. I will train the 2-way GAN.

I saw you updated the code last night. Which script should I use, 2WayGan_Train or V2?

mtics avatar Apr 22 '20 23:04 mtics

Hi, neither; I was just saying that the 1-way GAN pretrain results seem okay. I tried running on your dataset with a changed batch size and the results are not good either. It may have been that my dataset was too small and the network was just memorizing. I am exploring making the LAMBDA parameter adaptive, as in the paper. Do you have any leads on how to do it?
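One possible scheme in the spirit of the paper's adaptive weighting; the thresholds and scaling factors below are assumptions, not the paper's exact values:

def update_lambda(lam, grad_norm_avg, target=1.0, tol=0.05, factor=2.0):
    # grow LAMBDA when the critic's average gradient norm strays above the
    # 1-Lipschitz target, shrink it when it sits comfortably below
    # (target, tol, and factor are assumed, not taken from the paper)
    if grad_norm_avg > target + tol:
        return lam * factor
    if grad_norm_avg < target - tol:
        return lam / factor
    return lam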

MrRobot2211 avatar Apr 22 '20 23:04 MrRobot2211

I don't know either. The author trained his model on 5,000 pictures; I only use 1/10 of them.

I changed LAMBDA to 1, and the result was awful.

mtics avatar Apr 22 '20 23:04 mtics

Yes, but it has to vary as we train so you ensure the Lipschitz condition and get the nice swirls from the paper. Do not worry, I think I have got it. Have you checked whether the croppings are compatible with the original code and paper?
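The Lipschitz condition is typically enforced with a gradient penalty on interpolated samples; a minimal WGAN-GP-style sketch, assuming 4-D image tensors and not necessarily matching this repo's code:

import torch

def gradient_penalty(critic, real, fake):
    # penalize deviations of the critic's gradient norm from 1 on random
    # interpolations between real and fake batches (WGAN-GP)
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()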

MrRobot2211 avatar Apr 22 '20 23:04 MrRobot2211

Not yet. I'm new to deep learning; there are a lot of things I don't really understand...

mtics avatar Apr 22 '20 23:04 mtics

I merged my master branch and dev branch; you can take a look at master. With the help of my teacher and seniors, the code has improved greatly over the version yours was based on.

mtics avatar Apr 23 '20 03:04 mtics

Yes, when this run ends I'll upload it. In any case, with your dataset and a batch size of 6 I am getting this around epoch 22: Done training discriminator on iteration: 4 [Epoch 24/300] [Batch 6/14] [D loss: -2097.180758] [G loss: 4727.187500] Done training discriminator on iteration: 5. So I would rather think it was the batch size, which in some way affects the D/G ratio.

Could you upload your dataset?

renato-rodrig avatar Apr 15 '21 02:04 renato-rodrig