SinGAN
Image super-resolution
SinGAN is very impressive work, but in SR mode I do not understand why SinGAN is able to generate the correct SR image: not only the correct image size, but also the correct positions of objects.
For SR, SinGAN repeatedly upsamples the image and generates details using the finest-scale generator, and is therefore able to super-resolve images. The key idea is that image statistics tend to be similar across scales, so we can use the last generator to generate images larger than the LR training image.
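A minimal sketch of that upsample-and-refine loop, assuming a hypothetical `sr_inference` helper and a toy generator (the real Gs[-1] is a trained convolutional network, and SinGAN uses bilinear rather than nearest-neighbor resizing):

```python
import numpy as np

def upsample(img, r):
    # Nearest-neighbor upsample by factor r (simplified stand-in for
    # the bilinear resize the real code uses).
    h, w = img.shape[:2]
    new_h, new_w = int(np.ceil(h * r)), int(np.ceil(w * r))
    rows = (np.arange(new_h) / r).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / r).astype(int).clip(0, w - 1)
    return img[rows][:, cols]

def sr_inference(lr_img, generator, num_steps, r=2 ** (1 / 3)):
    """Iteratively upsample by r and refine with the finest-scale
    generator (a placeholder for Gs[-1])."""
    img = lr_img
    for _ in range(num_steps):
        img = upsample(img, r)
        img = generator(img)  # adds learned high-frequency detail
    return img

# Toy "generator": identity (a trained one would refine the image).
identity_G = lambda x: x
hr = sr_inference(np.zeros((30, 30)), identity_G, num_steps=6)
print(hr.shape)  # roughly 4x the LR size after 6 steps of 2^(1/3)
```

Because the refinement happens at every intermediate size, the generator only ever has to add detail at the single scale ratio it was trained on.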
Thanks for your patience. I have another question: at test time, why do we upsample and refine the LR image several times instead of upsampling it directly to the desired size (resolution) and feeding it to the last generator once?
For SR, SinGAN is trained with an upsampling factor of 2^(1/3) between levels, so to get an SR factor of x4, for example, you need to apply the upsampling process 6 times, since (2^(1/3))^6 = 4.
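The arithmetic behind that answer can be checked in two lines:

```python
r = 2 ** (1 / 3)   # per-step upsampling factor between SinGAN levels
steps = 6
total = r ** steps  # (2^(1/3))^6 = 2^(6/3) = 2^2 = 4
print(total)        # 4.0 (up to floating-point error)
```

More generally, an SR factor of s needs ceil(3 * log2(s)) refinement steps at this ratio.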
All right. THANKS again!
I've played with SinGAN for several days and am curious about super-resolution.
In the test phase we feed the LR image into the last generator several times; only the last generator is used to produce the HR image.
Gs_sr.append(Gs[-1])
So what is the benefit of training the earlier Gs on very small resized versions of the real image? Those lower-scale generators aren't used at prediction time. Maybe we could train fewer scales on a larger real image to obtain the detail-generating generators.
I think the training process repeatedly loads the previous scale's G and D, so the learned information is carried forward from the coarsest level to the finest level. That is why we need to train so many scales sequentially. Correct me if I am wrong.
# Warm-start the current scale's networks from the previous scale's weights
# (only when the channel width nfc is unchanged between the two scales)
if nfc_prev == opt.nfc:
    G_curr.load_state_dict(torch.load('%s/%d/netG.pth' % (opt.out_, scale_num - 1)))
    D_curr.load_state_dict(torch.load('%s/%d/netD.pth' % (opt.out_, scale_num - 1)))
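A framework-free sketch of that sequential scheme, with hypothetical names (`ToyG`, `train_pyramid`), and ignoring the nfc-match guard: each scale's generator is warm-started from the previous scale's weights, so coarse-level structure accumulates into the finest generator even though only Gs[-1] is used at SR test time.

```python
import copy

class ToyG:
    """Stand-in for a SinGAN generator: just holds a weight."""
    def __init__(self):
        self.w = 0.0
    def state_dict(self):
        return {"w": self.w}
    def load_state_dict(self, state):
        self.w = state["w"]

def train_pyramid(num_scales):
    Gs, prev_state = [], None
    for scale in range(num_scales):
        G = ToyG()
        if prev_state is not None:
            G.load_state_dict(prev_state)  # warm-start from the coarser scale
        G.w += 1.0                         # placeholder for adversarial training
        prev_state = copy.deepcopy(G.state_dict())
        Gs.append(G)
    return Gs

Gs = train_pyramid(4)
print(Gs[-1].w)  # learning accumulates across scales -> 4.0
```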