Deep-Mutual-Learning
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
When updating the sub-network, is there any need to retain the graph, i.e. call
`loss.backward(retain_graph=True)`?
When I reproduce the procedure, the code raises this error, but I don't know whether retaining the graph is the correct fix.
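For context, here is a minimal sketch of the failure mode with placeholder models and names (`net1`, `net2` are not the repo's code). Because the KL term of the first loss is built on the second network's non-detached output, the first `backward()` frees both graphs and the second `backward()` hits the error:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

net1, net2 = nn.Linear(10, 5), nn.Linear(10, 5)
ce = nn.CrossEntropyLoss()
x, y = torch.randn(8, 10), torch.randint(0, 5, (8,))

out1, out2 = net1(x), net2(x)

# Mutual-learning loss for net1; out2 is NOT detached, so net2's graph
# becomes part of loss1 and its buffers are freed by this backward().
loss1 = ce(out1, y) + F.kl_div(F.log_softmax(out1, dim=1),
                               F.softmax(out2, dim=1), reduction='batchmean')
loss1.backward()

# Mutual-learning loss for net2 reuses out1 and out2, whose graphs were
# just freed -> "Trying to backward through the graph a second time ..."
loss2 = ce(out2, y) + F.kl_div(F.log_softmax(out2, dim=1),
                               F.softmax(out1, dim=1), reduction='batchmean')
loss2.backward()  # raises RuntimeError
```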
@shaoeric When I ran this code, this error didn't appear, but your report is plausible. There may indeed be some bugs in this code because it is no longer maintained. The core of the code is the implementation of the KL divergence loss, and I have verified that this part is correct.
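For reference, the mutual-learning KL term is usually implemented roughly like this. This is a sketch of the standard formulation, not necessarily the exact code in this repo; the class name `MutualKLLoss` and the temperature `T` are assumptions (in deep mutual learning `T` is often just 1):

```python
import torch.nn as nn
import torch.nn.functional as F

class MutualKLLoss(nn.Module):
    """D_KL(p_peer || p_student): the mimicry term of deep mutual learning."""
    def __init__(self, T: float = 1.0):
        super().__init__()
        self.T = T

    def forward(self, student_logits, peer_logits):
        log_p_student = F.log_softmax(student_logits / self.T, dim=1)
        p_peer = F.softmax(peer_logits / self.T, dim=1)
        # F.kl_div(input, target) expects log-probabilities as input and
        # probabilities as target, and computes KL(target || exp(input)).
        return F.kl_div(log_p_student, p_peer, reduction='batchmean') * (self.T ** 2)
```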
@chxy95 Sorry, I solved the problem but forgot to close the issue. Here are the fixes I found:
- First, check your PyTorch version; a newer version may raise this error where an older one did not.
- Check whether the teacher model's output is detached from the graph: `teacher_out = teacher_out.detach()`
- If there is more than one teacher model, put their outputs into a Python list rather than stacking them into a single tensor, even if that tensor is detached (see the sketch below).
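Putting those points together, here is a hedged sketch of a single sub-network update (placeholder names, not the repo's code; `teachers` is a plain Python list of peer models):

```python
import torch.nn.functional as F

def update_student(student, teachers, optimizer, ce, x, y, kl_weight=1.0):
    """One update of the 'student' network; teacher outputs are detached so
    only the student's graph is built and nothing is backward-ed twice."""
    student_out = student(x)

    # Keep the teacher outputs in a Python list, each detached from its graph.
    teacher_outs = [t(x).detach() for t in teachers]

    loss = ce(student_out, y)
    for t_out in teacher_outs:
        loss = loss + kl_weight / len(teacher_outs) * F.kl_div(
            F.log_softmax(student_out, dim=1),
            F.softmax(t_out, dim=1),
            reduction='batchmean')

    optimizer.zero_grad()
    loss.backward()      # no retain_graph=True needed
    optimizer.step()
    return loss.item()
```

When each network takes its turn as the student, recompute its own forward pass inside that step so its output stays attached to a fresh graph; with this pattern, `retain_graph=True` is never required.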
Yeah, you should use detach(), because the graph buffers needed for the gradients are freed the first time you compute the KL loss and call backward().