RepDistiller
Question about the order of normalization in the Similarity-Preserving loss
In the Similarity-Preserving paper, the normalization is applied before the matrix multiplication. Does the order matter for the performance?
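To pin down the comparison, both snippets below compute the same loss (my transcription of the code; "normalize" is PyTorch's row-wise L2 normalization, i.e. F.normalize with dim=1):

$$\mathcal{L} = \frac{1}{b^{2}}\,\lVert G_t - G_s \rVert_F^{2}$$

The first variant builds the Gram matrix and then normalizes its rows,

$$G = \operatorname{normalize}(f f^{\top}), \qquad \operatorname{normalize}(A)_{[i,:]} = \frac{A_{[i,:]}}{\lVert A_{[i,:]} \rVert_2},$$

while the second normalizes the feature rows before the product,

$$G = \hat f \hat f^{\top}, \qquad \hat f_{[i,:]} = \frac{f_{[i,:]}}{\lVert f_{[i,:]} \rVert_2}.$$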
import torch

org_f_s = torch.rand((64, 96))  # stand-in student features: batch of 64, flattened dim 96
org_f_t = torch.rand((64, 96))  # stand-in teacher features
bsz = org_f_s.shape[0]

# Variant 1: build the Gram matrices first, then normalize.
f_s = org_f_s.view(bsz, -1)
f_t = org_f_t.view(bsz, -1)
G_s = torch.mm(f_s, torch.t(f_s))
# G_s = G_s / G_s.norm(2)
G_s = torch.nn.functional.normalize(G_s)  # rescales each row of G_s to unit L2 norm
G_t = torch.mm(f_t, torch.t(f_t))
# G_t = G_t / G_t.norm(2)
G_t = torch.nn.functional.normalize(G_t)
G_diff = G_t - G_s
loss = (G_diff * G_diff).view(-1, 1).sum(0) / (bsz * bsz)  # squared difference averaged over b*b entries
print(loss)
# Variant 2: normalize the feature rows first, then build the Gram matrices.
f_s = org_f_s.view(bsz, -1)
f_t = org_f_t.view(bsz, -1)
f_s = torch.nn.functional.normalize(f_s)  # unit L2 norm per sample
G_s = torch.mm(f_s, torch.t(f_s))  # entries are now cosine similarities
# G_s = G_s / G_s.norm(2)
f_t = torch.nn.functional.normalize(f_t)
G_t = torch.mm(f_t, torch.t(f_t))
# G_t = G_t / G_t.norm(2)
G_diff = G_t - G_s
loss = (G_diff * G_diff).view(-1, 1).sum(0) / (bsz * bsz)
print(loss)
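For what it's worth, the two orderings are not equivalent in general: normalizing the features first yields a cosine-similarity matrix with ones on the diagonal, whereas row-normalizing the Gram matrix afterwards rescales each row to unit L2 norm, so its diagonal is generally below one. A small check, continuing the snippet above:

# Features normalized first -> cosine-similarity Gram matrix, diagonal is exactly 1.
f_hat = torch.nn.functional.normalize(org_f_s.view(bsz, -1))
print(torch.mm(f_hat, torch.t(f_hat)).diag()[:4])  # ~tensor([1., 1., 1., 1.])

# Gram matrix normalized afterwards -> unit-norm rows, diagonal below 1 in general.
G = torch.mm(org_f_s.view(bsz, -1), torch.t(org_f_s.view(bsz, -1)))
G = torch.nn.functional.normalize(G)
print(G.diag()[:4])       # values < 1
print(G.norm(dim=1)[:4])  # ~tensor([1., 1., 1., 1.])

So the two printed loss values will differ; whether that gap translates into a difference in distillation accuracy is exactly the question.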