RepDistiller
What is the difference between the positions of the "with torch.no_grad()" block?
Hi, thank you for your contribution; the code has been very useful to me. I have a question about this part of the code:
```python
with torch.no_grad():
    # momentum (EMA) update of the v1 memory bank at the indices y
    l_pos = torch.index_select(self.memory_v1, 0, y.view(-1))
    l_pos.mul_(momentum)
    l_pos.add_(torch.mul(v1, 1 - momentum))
    # re-normalize the updated entries to unit length
    l_norm = l_pos.pow(2).sum(1, keepdim=True).pow(0.5)
    updated_v1 = l_pos.div(l_norm)
    self.memory_v1.index_copy_(0, y, updated_v1)

    # same EMA update and re-normalization for the v2 memory bank
    ab_pos = torch.index_select(self.memory_v2, 0, y.view(-1))
    ab_pos.mul_(momentum)
    ab_pos.add_(torch.mul(v2, 1 - momentum))
    ab_norm = ab_pos.pow(2).sum(1, keepdim=True).pow(0.5)
    updated_v2 = ab_pos.div(ab_norm)
    self.memory_v2.index_copy_(0, y, updated_v2)
```
In your implementation of the paper, you compute the loss terms first and then update the memory banks. If I instead update the memory first and then compute the loss terms, what is the difference between these two orderings?
Looking forward to your reply! Thank you!
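For concreteness, here is a minimal sketch of the two orderings (the names `memory`, `feat`, and `idx`, the toy shapes, and the helper `ema_update` are illustrative stand-ins for `self.memory_v1`, `v1`, and `y`, not the repo's code):

```python
import torch

# Toy memory bank and a batch of two anchor features, all unit-normalized.
n_data, dim, momentum = 8, 4, 0.5
memory = torch.nn.functional.normalize(torch.randn(n_data, dim), dim=1)
feat = torch.nn.functional.normalize(torch.randn(2, dim), dim=1)
idx = torch.tensor([0, 3])

def ema_update(memory, feat, idx, momentum):
    # Same EMA-and-renormalize step as the snippet above.
    with torch.no_grad():
        pos = torch.index_select(memory, 0, idx)
        pos.mul_(momentum).add_(feat * (1 - momentum))
        pos.div_(pos.norm(dim=1, keepdim=True))
        memory.index_copy_(0, idx, pos)

# Ordering A (as in the repo): score against the old memory, then update.
logits_a = feat @ memory.t()
ema_update(memory, feat, idx, momentum)

# Ordering B: update first, then score. memory_b[idx] now already mixes in
# `feat`, so the entries at idx are partly the anchor itself.
memory_b = torch.nn.functional.normalize(torch.randn(n_data, dim), dim=1)
ema_update(memory_b, feat, idx, momentum)
logits_b = feat @ memory_b.t()
```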
I think generally it's fine, except that you might sample the same feature as your anchor, either as a positive or as a negative.
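A rough, self-contained illustration of that risk (the vector sizes and momentum value here are made up; in the repo the affected entry would be `memory_v1[y]` right after the update):

```python
import torch

# If the memory is updated first, memory[y] = normalize(m * old + (1 - m) * v)
# already contains the current anchor feature v, so drawing index y as a
# "negative" (or scoring it as the positive) compares v partly against itself.
torch.manual_seed(0)
m = 0.5  # assumed momentum value, for illustration only
old = torch.nn.functional.normalize(torch.randn(128), dim=0)  # stale memory entry
v = torch.nn.functional.normalize(torch.randn(128), dim=0)    # current anchor feature
new = torch.nn.functional.normalize(m * old + (1 - m) * v, dim=0)
print(f"v . old = {(v @ old).item():.3f}")  # ~0 for random unit vectors
print(f"v . new = {(v @ new).item():.3f}")  # clearly positive: the entry leaks the anchor
```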