Question about class SILog
Hi,
In the forward function of the class SILog, currently only the shape of the tensor var_error is reduced (see line 29 onwards in silog.py):
```python
if var_error.ndim > 1:
    var_error = var_error.mean(dim=-1)
if self.integrated > 0.0:
    scale_error = mean_error**2
    var_error = var_error + self.integrated * scale_error * (1 - si.int())
out_loss = self.output_fn(var_error)
return out_loss
```
Shouldn't the shape of the tensor mean_error be reduced in exactly the same way? If it is not, then for a batch size of 8 (and assuming num_copies=1), mean_error of shape [8, 1] broadcasts against var_error of shape [8], so the out_loss tensor has torch.Size([8, 8]) instead of torch.Size([8]). In that case, is the mean computed correctly in class UniDepthV2?
```python
depth_losses = loss(
    outputs["depth"],
    target=inputs["depth"],
    mask=inputs["depth_mask"].clone(),
    si=si,
)
losses["opt"][loss.name] = loss.weight * depth_losses.mean()
```
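If I am reading the shapes correctly, the mismatch comes from broadcasting: mean_error of shape (8, 1) combined with var_error of shape (8,) yields an (8, 8) matrix. A minimal sketch of this (using NumPy, which follows the same broadcasting rules as PyTorch; the 0.15 weight mirrors self.integrated from my example below):

```python
import numpy as np

# Assumed shapes from the example below: mean_error keeps its
# (batch, num_copies) shape of (8, 1), while var_error has already
# been reduced to (8,).
mean_error = np.random.rand(8, 1)
var_error = np.random.rand(8)

# (8, 1) broadcast against (8,) gives (8, 8): every sample's mean_error
# is combined with every other sample's var_error, not just its own.
out_loss = np.sqrt(var_error + 0.15 * mean_error**2)
print(out_loss.shape)  # (8, 8), not (8,)
```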
Perhaps I am misunderstanding the computation; I would appreciate your help in clarifying this.
Thank you!
Here is an example:
```
batch_size = 8
num_copies = 1
si: tensor([0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0', dtype=torch.int32)
self.integrated = 0.15
```
Using the current method (i.e. only changing the shape of var_error):
```
mean_error tensor([[ 0.8700],
        [ 0.0012],
        [ 0.0655],
        [-0.0843],
        [ 0.0229],
        [-0.0147],
        [-0.0198],
        [-0.0270]], device='cuda:0', grad_fn=<SqueezeBackward2>)
var_error tensor([0.0003, 0.0109, 0.0089, 0.0024, 0.0006, 0.0069, 0.0199, 0.0033],
        device='cuda:0', grad_fn=<MeanBackward1>)
out_loss tensor([[0.3375, 0.3529, 0.3500, 0.3406, 0.3380, 0.3471, 0.3654, 0.3420],
        [0.0189, 0.1049, 0.0947, 0.0499, 0.0268, 0.0834, 0.1413, 0.0586],
        [0.0316, 0.1080, 0.0980, 0.0560, 0.0369, 0.0872, 0.1436, 0.0638],
        [0.0377, 0.1099, 0.1001, 0.0596, 0.0422, 0.0896, 0.1450, 0.0670],
        [0.0209, 0.1053, 0.0951, 0.0507, 0.0282, 0.0839, 0.1416, 0.0592],
        [0.0198, 0.1051, 0.0949, 0.0502, 0.0274, 0.0836, 0.1414, 0.0588],
        [0.0204, 0.1052, 0.0950, 0.0505, 0.0279, 0.0838, 0.1415, 0.0591],
        [0.0216, 0.1055, 0.0953, 0.0510, 0.0287, 0.0841, 0.1417, 0.0595]],
        device='cuda:0', grad_fn=<SqrtBackward0>)
```
out_loss.mean() = 0.10882534831762314
Notice that for each sample, the SI log error is the corresponding diagonal entry of out_loss. Therefore, when the mean of out_loss is taken, it also averages in the off-diagonal cross terms between different samples, which may not be the intention.
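The diagonal claim can be checked with a small sketch (assumed shapes and weight as in the example above; NumPy broadcasts the same way as PyTorch): reducing mean_error to shape (8,) recovers exactly the diagonal of the broadcast (8, 8) matrix.

```python
import numpy as np

mean_error = np.random.rand(8, 1)  # (batch, num_copies)
var_error = np.random.rand(8)      # already reduced to (batch,)
w = 0.15                           # stands in for self.integrated

# Current behaviour: (8, 1) vs (8,) broadcasts to an (8, 8) matrix,
# where entry [i, j] mixes sample i's mean with sample j's variance.
broadcast_loss = np.sqrt(var_error + w * mean_error**2)

# Proposed behaviour: reduce mean_error too, giving one value per sample.
per_sample_loss = np.sqrt(var_error + w * mean_error.mean(-1) ** 2)

# The intended per-sample losses are exactly the diagonal of the matrix.
assert np.allclose(np.diag(broadcast_loss), per_sample_loss)
```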
For comparison, here is the output when the shapes of both mean_error and var_error are reduced:
```
mean_error tensor([ 0.8700, 0.0012, 0.0655, -0.0843, 0.0229, -0.0147, -0.0198, -0.0270],
        device='cuda:0', grad_fn=<MeanBackward1>)
var_error tensor([0.0003, 0.0109, 0.0089, 0.0024, 0.0006, 0.0069, 0.0199, 0.0033],
        device='cuda:0', grad_fn=<MeanBackward1>)
out_loss_changed tensor([0.3375, 0.1049, 0.0980, 0.0596, 0.0282, 0.0836, 0.1415, 0.0595],
        device='cuda:0', grad_fn=<SqrtBackward0>)
```
out_loss_changed.mean() = 0.11410681903362274
The final loss value for the batch therefore changes noticeably (0.1088 vs. 0.1141).