contextual_loss_pytorch
water ripple artifacts when using CoBi Loss
When I train ESRGAN using only the Contextual Bilateral Loss, without L1/perceptual/GAN losses,
all the inference results show water-ripple-like artifacts in smooth areas.
Do you have any idea how this artifact might come about?
@conson0214 I think it is caused by using only CoBi_{VGG} from Eq. (3). Even though it causes over-smoothing, the pixel-wise loss plays a very important role in SR problems, so I suggest also using CoBi_{RGB}.
The loss function in my training code is as follows. I use CoBi_{RGB} as part of the total loss, but I'm wondering if I have defined it properly.
import contextual_loss as cl

criterion_rgb = cl.ContextualBilateralLoss(use_vgg=False, loss_type='cosine').to(self.device)
l_cobirgb = criterion_rgb(self.fake_H, self.real_H)

criterion_relu1_2 = cl.ContextualBilateralLoss(use_vgg=True, loss_type='cosine', vgg_layer='relu1_2').to(self.device)
l_cobirelu1_2 = criterion_relu1_2(self.fake_H, self.real_H)

criterion_relu2_2 = cl.ContextualBilateralLoss(use_vgg=True, loss_type='cosine', vgg_layer='relu2_2').to(self.device)
l_cobirelu2_2 = criterion_relu2_2(self.fake_H, self.real_H)

criterion_relu3_4 = cl.ContextualBilateralLoss(use_vgg=True, loss_type='cosine', vgg_layer='relu3_4').to(self.device)
l_cobirelu3_4 = criterion_relu3_4(self.fake_H, self.real_H)

l_total = l_cobirgb + l_cobirelu1_2 + l_cobirelu2_2 + 0.5 * l_cobirelu3_4
The paper uses n×n RGB patches as features for CoBi_RGB. Your code compares single RGB values only.
How do I calculate CoBi_RGB using n×n RGB patches as features? I'm confused by this part of the paper. Is it something like 1/(w*h) * Σ criterion_rgb(self.fake_H(n×n), self.real_H(n×n)), i.e. compute CoBi_RGB per pixel over its n×n neighbourhood and then average over the whole image? Am I right?
It probably means the input of CoBi_RGB should be vectors of n×n RGB values. Currently, this package does not support that feature conversion, so you need to define it outside of the package.
For example:
import torch
import contextual_loss as cl

# dummy image, shape: (n, c, h, w)
n, c, h, w = 1, 3, 64, 64
img = torch.rand(n, c, h, w)

# sample patches, shape: (n, c, kernel_size, kernel_size, n_patches)
patches = sample_patches(img, kernel_size=3, stride=2, padding=0)

# convert to vectors, shape: (n, c*kernel_size*kernel_size, n_patches, 1)
n_patches = patches.shape[-1]
vectors = patches.reshape(n, -1, n_patches, 1)

criterion = cl.ContextualBilateralLoss()
loss = criterion(vectors, vectors)
What's the best way to implement sample_patches(img, kernel_size=3, stride=2, padding=0)?
Something like this?
import torch.nn.functional as F

def sample_patches(x, kernel_size=3, stride=2, padding=0):
    b = x.shape[0]
    x = F.pad(x, (padding // 2, padding // 2, padding // 2, padding // 2))
    # Extract sliding patches along H and W: (b, c, n_h, n_w, k, k)
    patches = x.unfold(2, kernel_size, stride).unfold(3, kernel_size, stride)
    # Reorder so each patch can be flattened into one vector: (b, k, k, c, n_h, n_w)
    patches = patches.permute(0, 4, 5, 1, 2, 3).contiguous()
    # Flatten each patch: (b, c*kernel_size*kernel_size, n_h, n_w)
    return patches.view(b, -1, patches.shape[-2], patches.shape[-1])
@varun19299 Maybe.
This code will help you. https://github.com/S-aiueo32/srntt-pytorch/blob/master/models/swapper.py#L198-L230
Yes, this works, modified to include the batch size.
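For reference, a minimal sketch of what such a batch-aware sample_patches might look like using F.unfold; the function name and the (n, c, kernel_size, kernel_size, n_patches) output layout follow the example above, and this is only an illustration, not part of the package:

import torch
import torch.nn.functional as F

def sample_patches(x, kernel_size=3, stride=2, padding=0):
    # x: (n, c, h, w)
    n, c = x.shape[:2]
    # F.unfold returns (n, c*kernel_size*kernel_size, n_patches),
    # with the flattened dimension ordered as (c, kh, kw)
    patches = F.unfold(x, kernel_size=kernel_size, stride=stride, padding=padding)
    n_patches = patches.shape[-1]
    # reshape to (n, c, kernel_size, kernel_size, n_patches)
    return patches.view(n, c, kernel_size, kernel_size, n_patches)

# usage: (4, 3, 64, 64) -> patches (4, 3, 3, 3, 961) -> vectors (4, 27, 961, 1)
img = torch.rand(4, 3, 64, 64)
patches = sample_patches(img, kernel_size=3, stride=2, padding=0)
vectors = patches.reshape(4, -1, patches.shape[-1], 1)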
Thanks for the quick reply.
Also, with regard to the OOM issue: do you recommend using cosine distance for CoBi_RGB too?
I've tried all three loss_type options, and they all seem to hit the OOM problem.
Did you extract patches before applying CoBi/CX?
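If memory is still an issue, using a larger stride or randomly subsampling the patch vectors before the loss keeps the pairwise distance matrix small, since it grows roughly with the square of the number of patches. A rough sketch; the helper subsample_patches below is purely illustrative and not part of the package:

import torch

def subsample_patches(vectors, max_patches=1024):
    # vectors: (n, c*k*k, n_patches, 1); the pairwise distance matrix inside
    # CoBi/CX grows roughly with n_patches**2, so cap the patch count
    n_patches = vectors.shape[2]
    if n_patches <= max_patches:
        return vectors
    idx = torch.randperm(n_patches, device=vectors.device)[:max_patches]
    return vectors[:, :, idx, :]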