alpha_mix_active_learning
closed_form_approx grads-calculation
loss = F.cross_entropy(out, pred_1.to(self.device))
grads = torch.autograd.grad(loss, var_emb)[0].data.cpu()
In my understanding, both out and pred_1 come from the same forward pass of the unlabeled samples through the model, except that out skips the Softmax (and of course the Argmax, since it keeps the raw scores for all classes). grads then holds the gradients of this loss with respect to var_emb.
I don't really get that part. I suppose it is part of the closed-form approximation of Eq. 5, but could you please elaborate on what exactly the gradients stored in grads mean?
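For reference, here is a minimal standalone sketch of how I currently read those two lines (the toy linear head and all names here are my own illustration, not the repo's code), so you can correct me if my reading is off:

import torch
import torch.nn.functional as F

# Toy stand-in for the classifier head applied on top of the embedding.
torch.manual_seed(0)
n_samples, emb_dim, n_classes = 4, 8, 3
clf_head = torch.nn.Linear(emb_dim, n_classes)

# var_emb: unlabeled embeddings with gradient tracking enabled.
var_emb = torch.randn(n_samples, emb_dim, requires_grad=True)

# out: raw logits for every class (no Softmax/Argmax applied).
out = clf_head(var_emb)

# pred_1: the model's own Argmax predictions, used as pseudo-labels.
pred_1 = out.argmax(dim=1)

# Cross-entropy of the logits against those pseudo-labels...
loss = F.cross_entropy(out, pred_1)

# ...and its gradient with respect to the embeddings: one vector per
# unlabeled sample, giving the direction in embedding space along which
# the loss (i.e. disagreement with the current pseudo-label) grows fastest.
grads = torch.autograd.grad(loss, var_emb)[0].detach().cpu()
print(grads.shape)  # torch.Size([4, 8]): one gradient per sample, per embedding dimension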