VGG Loss
Implementation Analysis
Code first
Let's take the following implementation:
import torch.nn as nn

# Note: Vgg19 is assumed to be defined elsewhere (a sketch is given below)
class VGGLoss(nn.Module):
    def __init__(self, gpu_ids):
        super(VGGLoss, self).__init__()
        # VGG-19 feature extractor used as the projection, moved to the GPU
        self.vgg = Vgg19().cuda()
        # L1 distance applied level by level
        self.criterion = nn.L1Loss()
        # Per-level weights: deeper (lower-resolution) activations count more
        self.weights = [1.0 / 32, 1.0 / 16, 1.0 / 8, 1.0 / 4, 1.0]

    def forward(self, x, y):
        # Project both images into stacks of VGG feature maps
        x_vgg, y_vgg = self.vgg(x), self.vgg(y)
        loss = 0
        for i in range(len(x_vgg)):
            # Detach the target features so gradients flow only through x
            loss += self.weights[i] * self.criterion(x_vgg[i], y_vgg[i].detach())
        return loss
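The Vgg19 module referenced above is not defined in the snippet; it is expected to return the list of intermediate VGG-19 activations. Below is a minimal sketch of such a wrapper, assuming a pretrained torchvision VGG-19 split into five frozen chunks; the slice boundaries are an assumption of this sketch, mirroring common implementations such as pix2pixHD.

import torch.nn as nn
import torchvision

class Vgg19(nn.Module):
    def __init__(self, requires_grad=False):
        super(Vgg19, self).__init__()
        # Pretrained VGG-19 convolutional backbone from torchvision
        features = torchvision.models.vgg19(pretrained=True).features
        # Five chunks, each ending at one activation (relu1_1 ... relu5_1)
        self.slices = nn.ModuleList([
            features[0:2],
            features[2:7],
            features[7:12],
            features[12:21],
            features[21:30],
        ])
        if not requires_grad:
            # Freeze the backbone: the loss should not update VGG weights
            for param in self.parameters():
                param.requires_grad = False

    def forward(self, x):
        # Return the stack of feature maps, from highest to lowest resolution
        outputs = []
        for block in self.slices:
            x = block(x)
            outputs.append(x)
        return outputs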
Explanation
The idea behind VGG Loss is simple: it is the combination of a projection and a distance.
The purpose of the projection is to transform the initial raw, possibly sensor-related representation into a more semantic one.
In this case, since the VGG convolutional backbone is used to implement the projection, the resulting representation is the stacked set of VGG feature maps.
The distance function then has to be one that works well with images; here the L1 loss is used, but other choices are clearly possible.
Finally, since this is not a distance between a pair of images but between a pair of feature stacks, we have to sum over the stack depth to obtain a single number at the end of the process. Furthermore, the individual depth-specific contributions are not directly comparable, since their resolutions differ, so a weighted sum is needed, giving more weight to the lower-resolution (deeper) feature maps.
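As a usage sketch (the batch shape, image size and gpu_ids value are illustrative assumptions, not part of the original snippet), the loss would be applied to a generated/ground-truth image pair as follows:

import torch

# Hypothetical tensors: a batch of 4 RGB images at 256x256 on the GPU;
# the generated image requires gradients, as a generator output would
criterion_vgg = VGGLoss(gpu_ids=[0])
fake = torch.rand(4, 3, 256, 256, device="cuda", requires_grad=True)
real = torch.rand(4, 3, 256, 256, device="cuda")

loss = criterion_vgg(fake, real)  # scalar: weighted sum of per-level L1 distances
loss.backward()                   # gradients reach fake; the target features are detached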
LaTeX
f_{VGG}(I) \rightarrow \{H_{i}\}_{i=1,\dots,n}
I \in \mathcal{I} \quad H_{i} \in \mathcal{H}
f_{VGG}(I^{(1,2)}) \rightarrow \{H_{i}^{(1,2)}\}_{i=1,\dots,n}
L(H^{(1)}, H^{(2)}) \rightarrow \mathbb{R}
L_{1}(H_{i}^{(1)}, H_{i}^{(2)})
D = \sum_{i=1}^{n} w_{i} L_{1}(H_{i}^{(1)}, H_{i}^{(2)})
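Putting the pieces together with the concrete weights used in the implementation above (n = 5 activation levels), the loss that is actually computed can be restated as:

D(I^{(1)}, I^{(2)}) = \sum_{i=1}^{5} w_{i} \, L_{1}\left(H_{i}^{(1)}, H_{i}^{(2)}\right), \quad (w_{1}, \dots, w_{5}) = \left(\tfrac{1}{32}, \tfrac{1}{16}, \tfrac{1}{8}, \tfrac{1}{4}, 1\right)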