RepDistiller
What is the difference between the positions of the "with torch.no_grad()" block?
Hi, thank you for your contribution; the code has been very useful to me. I have a question about this part of the code:
```python
with torch.no_grad():
    # momentum (EMA) update of the v1 memory bank at the indices y
    l_pos = torch.index_select(self.memory_v1, 0, y.view(-1))
    l_pos.mul_(momentum)
    l_pos.add_(torch.mul(v1, 1 - momentum))
    # re-normalize the updated entries to unit length
    l_norm = l_pos.pow(2).sum(1, keepdim=True).pow(0.5)
    updated_v1 = l_pos.div(l_norm)
    self.memory_v1.index_copy_(0, y, updated_v1)

    # same EMA update and re-normalization for the v2 memory bank
    ab_pos = torch.index_select(self.memory_v2, 0, y.view(-1))
    ab_pos.mul_(momentum)
    ab_pos.add_(torch.mul(v2, 1 - momentum))
    ab_norm = ab_pos.pow(2).sum(1, keepdim=True).pow(0.5)
    updated_v2 = ab_pos.div(ab_norm)
    self.memory_v2.index_copy_(0, y, updated_v2)
```
In your implementation of the paper, you compute the loss terms first and then update the memory banks. If I instead update the memory first and then compute the loss terms, what is the difference between these two orderings?
Looking forward to your reply! Thank you!
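For concreteness, here is a minimal sketch of the two orderings (the names `memory`, `feat`, and `idx`, the toy shapes, and the helper `ema_update` are illustrative stand-ins for `self.memory_v1`, `v1`, and `y`, not the repo's code):

```python
import torch

# Toy memory bank and a batch of two anchor features, all unit-normalized.
n_data, dim, momentum = 8, 4, 0.5
memory = torch.nn.functional.normalize(torch.randn(n_data, dim), dim=1)
feat = torch.nn.functional.normalize(torch.randn(2, dim), dim=1)
idx = torch.tensor([0, 3])

def ema_update(memory, feat, idx, momentum):
    # Same EMA-and-renormalize step as the snippet above.
    with torch.no_grad():
        pos = torch.index_select(memory, 0, idx)
        pos.mul_(momentum).add_(feat * (1 - momentum))
        pos.div_(pos.norm(dim=1, keepdim=True))
        memory.index_copy_(0, idx, pos)

# Ordering A (as in the repo): score against the old memory, then update.
logits_a = feat @ memory.t()
ema_update(memory, feat, idx, momentum)

# Ordering B: update first, then score. memory_b[idx] now already mixes in
# `feat`, so the entries at idx are partly the anchor itself.
memory_b = torch.nn.functional.normalize(torch.randn(n_data, dim), dim=1)
ema_update(memory_b, feat, idx, momentum)
logits_b = feat @ memory_b.t()
```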
I think generally it's fine, except that you might sample the same feature as your anchor, either as a positive or as a negative.
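A rough, self-contained illustration of that risk (the vector sizes and momentum value here are made up; in the repo the affected entry would be `memory_v1[y]` right after the update):

```python
import torch

# If the memory is updated first, memory[y] = normalize(m * old + (1 - m) * v)
# already contains the current anchor feature v, so drawing index y as a
# "negative" (or scoring it as the positive) compares v partly against itself.
torch.manual_seed(0)
m = 0.5  # assumed momentum value, for illustration only
old = torch.nn.functional.normalize(torch.randn(128), dim=0)  # stale memory entry
v = torch.nn.functional.normalize(torch.randn(128), dim=0)    # current anchor feature
new = torch.nn.functional.normalize(m * old + (1 - m) * v, dim=0)
print(f"v . old = {(v @ old).item():.3f}")  # ~0 for random unit vectors
print(f"v . new = {(v @ new).item():.3f}")  # clearly positive: the entry leaks the anchor
```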