pytorch-adacos
nan value
While trying to replicate AdaCos, we find B_avg tending to inf. Can you help me with this?
m=0.5, B_avg value before inf = 8.3499e+35
Thanks
Are you training with your own dataset? Can you tell me more details?
We are training with VGGFace2 dataset.
I experienced this issue. It seems related to this other issue.
My fix is to change the optimizer from
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()),
                      lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay)
to
from itertools import chain
optimizer = optim.SGD(filter(lambda p: p.requires_grad,
                             chain(model.parameters(), metric_fc.parameters())),
                      lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay)
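To illustrate why this matters, here is a minimal, self-contained sketch (the `backbone`/`metric_fc` modules are stand-ins, not the repository's actual classes): `itertools.chain` merges the parameter iterators of both modules, so the AdaCos head's weights are actually registered with the optimizer and get updated, instead of silently staying frozen.

```python
from itertools import chain

import torch.nn as nn
import torch.optim as optim

backbone = nn.Linear(8, 4)    # stand-in for the feature-extractor model
metric_fc = nn.Linear(4, 10)  # stand-in for the AdaCos classification head

# chain() yields backbone's parameters followed by metric_fc's, so the
# optimizer sees all four tensors (weight + bias of each module).
optimizer = optim.SGD(
    filter(lambda p: p.requires_grad,
           chain(backbone.parameters(), metric_fc.parameters())),
    lr=0.1, momentum=0.9, weight_decay=5e-4)

n_params = len(optimizer.param_groups[0]["params"])
print(n_params)  # 4
```

With only `model.parameters()` the head's weights would never receive optimizer steps, which is consistent with the training blowing up.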
I found another issue that raises the nan value.
The scale variable s should be updated during training only, i.e., using the training split. However, it is updated every time the forward method is called, so it is currently updated on both the training and testing splits. I found that this raises nan values frequently. So I changed
with torch.no_grad():
    ........
    self.s = torch.log(B_avg) / torch.cos(torch.min(math.pi/4 * torch.ones_like(theta_med), theta_med))
to
if self.training:
    with torch.no_grad():
        ........
        self.s = torch.log(B_avg) / torch.cos(torch.min(math.pi/4 * torch.ones_like(theta_med), theta_med))
self.training is already defined inside AdaCos because it inherits from nn.Module (it is toggled by .train()/.eval()), so there is no need to define this variable yourself.
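To show the fix in context, here is a rough, self-contained sketch of an AdaCos-style head with the guard in place. The class name `AdaCosHead` and the internal details are illustrative (following the AdaCos paper's update rule), not the repository's exact code; the point is only that the scale refresh sits inside `if self.training:`, so calling the module under `.eval()` leaves `self.s` untouched.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaCosHead(nn.Module):
    """Illustrative AdaCos-style head; scale s updates only in train mode."""

    def __init__(self, in_features, num_classes):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_classes, in_features))
        # Fixed initial scale from the AdaCos paper: sqrt(2) * ln(C - 1)
        self.s = math.sqrt(2) * math.log(num_classes - 1)

    def forward(self, x, labels=None):
        # Cosine similarities between normalized features and class weights
        logits = F.linear(F.normalize(x), F.normalize(self.W))
        theta = torch.acos(logits.clamp(-1 + 1e-7, 1 - 1e-7))
        # The guard: self.training is set by nn.Module via .train()/.eval(),
        # so evaluation batches never perturb the adaptive scale.
        if self.training and labels is not None:
            with torch.no_grad():
                one_hot = F.one_hot(labels, logits.size(1)).bool()
                # B_avg: mean over the batch of summed non-target exp-logits
                B_avg = torch.where(one_hot, torch.zeros_like(logits),
                                    torch.exp(self.s * logits)).sum(dim=1).mean()
                theta_med = theta[one_hot].median()
                self.s = (torch.log(B_avg)
                          / torch.cos(torch.min(math.pi / 4 * torch.ones_like(theta_med),
                                                theta_med))).item()
        return self.s * logits
```

Running the head in train mode updates s; switching to eval mode and calling forward again leaves s unchanged, which is exactly the behavior the fix above is meant to enforce.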