GuruGuha
because autograd.grad takes in a sequence (tuple) of tensors (the inputs w.r.t. which the gradient of the output has to be computed), the return is also a sequence (tuple) of gradient tensors w.r.t...
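A minimal sketch of that tuple-in, tuple-out behaviour, assuming PyTorch is installed. The tensors `x` and `w` and the function `y = w * x**2` are made up for illustration and are not from the original thread:

```python
import torch

# Hypothetical scalar inputs, chosen only for illustration.
x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)
y = w * x ** 2

# Passing a tuple of inputs returns a tuple with one gradient per input.
grads = torch.autograd.grad(y, (x, w))
dy_dx, dy_dw = grads  # dy/dx = 2*w*x, dy/dw = x**2

# Even with a single input tensor the result is still a tuple of length 1.
# The expression is rebuilt because the first call freed the graph of y.
single = torch.autograd.grad(w * x ** 2, x)
```

So if you only need one gradient, index or unpack the result (e.g. `single[0]`) instead of using the return value directly.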
Did you try training the darknet_53 model using Adam? Curiously, I notice the convergence is a lot worse than it is using SGD... this is contrary to my...
@Amar1729 Any update on this?
@drscotthawley, any luck with this?