
The qualitative comparison with PUTconv in Figure 7

CyrilCsy opened this issue · 3 comments

I am very curious why there is such a big gap between your results and those of a general CNN-based encoder. A CNN should be able to learn to distinguish the masked region to some extent.

CyrilCsy avatar Sep 01 '22 15:09 CyrilCsy

Hi @CyrilCsy,

Thanks for your interest in our work. A CNN-based encoder can indeed learn good features. However, those features are not suitable for the UQ-Transformer (they are good for reconstruction). The reason is that the masked regions (zero pixels) have a negative impact on the unmasked regions. For PUT, the main artifact is that a patch is easily predicted as black (zero pixels) if: 1) a partially masked patch contains some black pixels; or 2) there are lots of black pixels in the unmasked regions. The CNN's convolutions transfer black pixels from the masked region into the unmasked region, which has a significant negative impact on the inpainted images.
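The contamination described above is easy to demonstrate: because a convolution's receptive field crosses the mask boundary, zeroed pixels change the features of nearby unmasked pixels. A minimal NumPy sketch (a naive averaging convolution, not the actual P-VQVAE encoder):

```python
import numpy as np

def conv2d_same(img, k):
    # Naive "same" 2D convolution with zero padding.
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
img = rng.uniform(0.5, 1.0, size=(8, 8))   # an all-bright image
masked = img.copy()
masked[:, :4] = 0.0                        # zero out the left half (the "hole")

k = np.ones((3, 3)) / 9.0                  # simple averaging kernel
feat_full = conv2d_same(img, k)
feat_masked = conv2d_same(masked, k)

# Column 4 is unmasked, but its features change because the 3x3 kernel
# overlaps the zeroed columns:
print(np.allclose(feat_full[:, 4], feat_masked[:, 4]))  # False
# Column 6 is far from the hole, so its features are unchanged:
print(np.allclose(feat_full[:, 6], feat_masked[:, 6]))  # True
```

A patch-wise encoder that processes non-overlapping patches independently avoids exactly this: zeros inside a masked patch cannot leak into the features of unmasked patches.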

By the way, I have also tried to fix this artifact. The next version of PUT is on the way.

liuqk3 avatar Sep 02 '22 02:09 liuqk3

Thanks for clearing up my confusion. I'm trying to train the model on Places. I set batch_size=64 and kept the number of epochs unchanged (100) when training the P-VQVAE, but it shows that training will take more than 20 days. I wonder if this is necessary? If I train it for only 10 epochs, will it make a big difference to the results?

CyrilCsy avatar Sep 08 '22 14:09 CyrilCsy

Hi @CyrilCsy ,

In my experience, P-VQVAE can achieve promising reconstruction capability even when the number of epochs is reduced. But you need to pay attention to some settings, for example the number of warm-up iterations and the iterations at which some losses are introduced (discriminator, LPIPS, etc.). You had better set these iteration counts according to their ratio of the total number of iterations.
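The proportional-rescaling advice above can be sketched as follows. All milestone names and numbers here are made up for illustration, not taken from the released configs:

```python
def scale_milestones(milestones, orig_total_iters, new_total_iters):
    """Rescale iteration-based milestones so they keep the same *ratio*
    of the total training schedule when the schedule is shortened."""
    ratio = new_total_iters / orig_total_iters
    return {name: int(it * ratio) for name, it in milestones.items()}

# Hypothetical original schedule: warm-up ends at 5k iterations and the
# GAN/LPIPS losses kick in at 30k iterations out of 100k total.
orig = {"warmup_end": 5_000, "gan_loss_start": 30_000}

# Cutting training to 1/10 of the iterations (e.g. 100 -> 10 epochs):
print(scale_milestones(orig, orig_total_iters=100_000, new_total_iters=10_000))
# → {'warmup_end': 500, 'gan_loss_start': 3000}
```

The point is to preserve the relative position of each milestone within the schedule, rather than reusing the absolute iteration counts from the longer run.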

liuqk3 avatar Sep 10 '22 16:09 liuqk3