ST-TR
A problem about error: "inplace operation"
Hi Chiaraplizz, I'd like to ask about a problem I ran into when running the code. When I start the training process, the following error occurs:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 512, 75, 25]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
My device config is set to [0, 1, 2, 3] and I use DataParallel for multi-GPU computation, which is the same as the configuration in your repository. Did you run into the same issue? I would greatly appreciate it if you could help me solve this problem.
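For readers unfamiliar with this message, here is a minimal sketch of how this class of error can arise. It is not taken from the ST-TR code, and the thread does not identify which operation in the model triggers it; the tensor shapes and function names below are made up for illustration. The sketch assumes only that a ReLU output is modified in place before the backward pass, which matches the "output 0 of ReluBackward0 ... is at version 1; expected version 0" wording of the error.

```python
import torch


def inplace_error_demo():
    """Trigger the error by editing a ReLU output in place (illustrative only)."""
    x = torch.randn(4, 3, requires_grad=True)
    y = torch.relu(x)  # autograd keeps this output to compute ReLU's gradient
    y += 1             # in-place edit: y's version counter goes from 0 to 1
    try:
        y.sum().backward()
    except RuntimeError as err:
        return str(err)
    return None


def out_of_place_demo():
    """Same computation, but the addition creates a new tensor."""
    x = torch.randn(4, 3, requires_grad=True)
    y = torch.relu(x)
    y = y + 1          # out-of-place: the saved ReLU output is left untouched
    y.sum().backward()
    return x.grad


if __name__ == "__main__":
    print(inplace_error_demo())
    print(out_of_place_demo().shape)
```

As the hint in the error message suggests, calling `torch.autograd.set_detect_anomaly(True)` before training makes PyTorch also report the forward-pass operation whose gradient failed, which may help locate the offending line in the actual model.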
Me too, is there a way?
Hi Iywinaaa, I faced the same problem before. I fixed it by downgrading PyTorch to a lower version. I hope this helps.
OK, thank you very much!
On Fri, Apr 28, 2023 at 10:23, Kuroshika @.***> wrote:
Hi Iywinaaa, I faced the same problem before. I fixed it by downgrading PyTorch to a lower version. I hope this helps.
Hello, may I ask which PyTorch version you downgraded to so that the error no longer appears?
My PyTorch version is 1.5.1, my cudatoolkit version is 10.1, and my torchvision version is 0.6.1. I hope this helps! I would not really recommend running it on 30-series GPUs, because I could not reproduce it successfully on a 3090 :(, but I did reproduce it on a machine with a Titan V.
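To compare a local environment against the versions reported above, something like the following sketch may be useful. The `REPORTED` table simply restates the PyTorch and torchvision versions from this comment; the helper names are hypothetical, and the script only checks pip-visible package metadata. The cudatoolkit version is a conda package and is not covered here (with PyTorch installed, `torch.version.cuda` shows the CUDA version the build was compiled against).

```python
from importlib import metadata

# Versions reported as working in this thread (not an official requirement).
REPORTED = {"torch": "1.5.1", "torchvision": "0.6.1"}


def base_version(version):
    """Drop a local build suffix such as '+cu101' from a version string."""
    return version.split("+", 1)[0]


def compare_with_reported(installed=None):
    """Return {package: (installed_or_None, reported, matches)}.

    `installed` may be a dict of versions for testing; when it is None,
    the versions are looked up from the current environment.
    """
    report = {}
    for name, wanted in REPORTED.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None  # package is not installed
        matches = found is not None and base_version(found) == wanted
        report[name] = (found, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (found, wanted, ok) in compare_with_reported().items():
        print(f"{name}: installed={found}, reported working={wanted}, match={ok}")
```

A mismatch here does not by itself explain the error; it only shows whether the environment differs from the one that was reported to work.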
Hello, can you tell me how you solved this issue?
Hello @AnainaM, have you been able to resolve this issue? I recently ran into it as well, and my CUDA version is 11.6.
A response would be helpful.
@olayinkaajayi Actually, I have not been able to fix this so far. If you find a solution, please share it with me as well. Thank you.
All right then, I'll be happy to share if I figure it out. By the way, are you doing PhD research related to skeleton-based action recognition?