A problem about error: "inplace operation"

Open Goldfish0106 opened this issue 2 years ago • 9 comments

Hi Chiaraplizz, I'd like to ask about a problem I ran into when running the code. When I start the training process, the following error occurs:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 512, 75, 25]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

My device config is set to [0, 1, 2, 3] and I use DataParallel for multi-GPU training, the same as the configuration in your repository. Did you run into the same issue? I would really appreciate any help in solving this problem.

Goldfish0106 avatar Dec 09 '22 06:12 Goldfish0106
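
For readers unfamiliar with this class of error, here is a minimal sketch of how it can arise. It is not a diagnosis of the ST-TR code: the tensor shapes and function names are illustrative assumptions. The sketch modifies the output of a ReLU in place, which invalidates the value that `ReluBackward0` saved for the backward pass, and then shows the out-of-place alternative.

```python
import torch


def inplace_after_relu_fails() -> bool:
    """Return True if modifying a ReLU output in place breaks backward()."""
    x = torch.randn(4, 3, requires_grad=True)
    y = torch.relu(x)  # ReLU's backward needs this output unchanged
    y += 1             # in-place update bumps y's version counter (0 -> 1)
    try:
        y.sum().backward()
    except RuntimeError as err:
        # Same family of message as the one reported in this issue.
        return "inplace operation" in str(err)
    return False


def out_of_place_works() -> bool:
    """Return True if the out-of-place variant backpropagates cleanly."""
    x = torch.randn(4, 3, requires_grad=True)
    y = torch.relu(x)
    y = y + 1          # allocates a new tensor, the saved ReLU output is intact
    y.sum().backward()
    return x.grad is not None


if __name__ == "__main__":
    print("in-place variant raises:", inplace_after_relu_fails())
    print("out-of-place variant works:", out_of_place_works())
```

To locate the offending operation in a real model, the hint in the error message applies: calling `torch.autograd.set_detect_anomaly(True)` before training makes the traceback point at the forward operation whose gradient failed, at the cost of slower execution.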

Me too, is there a way?

lywinaaa avatar Apr 10 '23 12:04 lywinaaa

Hi, lywinaaa, I faced the same problem before. I fixed it by downgrading PyTorch to a lower version. Hope this helps.

Kuroshika avatar Apr 28 '23 02:04 Kuroshika

OK, thank you very much!

lywinaaa avatar May 05 '23 14:05 lywinaaa

> Hi, lywinaaa, I faced the same problem before. I fixed it by downgrading PyTorch to a lower version. Hope this helps.

Hello, may I ask which version of PyTorch you downgraded to so that the error no longer occurs?

xiegedaimazhenfeijin avatar May 11 '23 13:05 xiegedaimazhenfeijin

My PyTorch version is 1.5.1, the cudatoolkit version is 10.1, and my torchvision version is 0.6.1. I hope this helps! I would not recommend running it on 30-series GPUs, because I could not reproduce the results on a 3090 :(, but I did manage to reproduce them on a machine with a Titan V.

Kuroshika avatar May 11 '23 13:05 Kuroshika
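
A small sketch for checking whether a local environment matches the versions reported above. The helper names are hypothetical, and the script only reads installed package metadata, so it runs even when PyTorch is absent. cudatoolkit is a conda package and is not visible this way; with PyTorch installed, `torch.version.cuda` reports the CUDA version the build was compiled against.

```python
from importlib import metadata

# Versions reported as working in this thread (not verified independently).
REPORTED = {"torch": "1.5.1", "torchvision": "0.6.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_with_reported(reported=REPORTED):
    """Map each package name to (installed version, matches reported version)."""
    result = {}
    for name, wanted in reported.items():
        found = installed_version(name)
        # pip wheels may carry a local suffix such as "+cu101"; ignore it.
        base = found.split("+")[0] if found else None
        result[name] = (found, base == wanted)
    return result


if __name__ == "__main__":
    for name, (found, matches) in compare_with_reported().items():
        status = "matches" if matches else "differs from"
        print(f"{name}: installed={found}, {status} reported {REPORTED[name]}")
```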

> Hi Chiaraplizz, I'd like to ask about a problem I ran into when running the code. When I start the training process, the following error occurs:
>
> RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 512, 75, 25]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
>
> My device config is set to [0, 1, 2, 3] and I use DataParallel for multi-GPU training, the same as the configuration in your repository. Did you run into the same issue? I would really appreciate any help in solving this problem.

Hello, can you tell me how you solved this issue?

AnainaM avatar Aug 06 '23 08:08 AnainaM

Hello @AnainaM, have you been able to resolve this issue? I recently ran into it as well, and my CUDA version is 11.6.

A response would be helpful.

olayinkaajayi avatar Aug 24 '23 15:08 olayinkaajayi

@olayinkaajayi Actually, I have not been able to fix this so far. If you find a solution, please share it with me as well. Thank you.

AnainaM avatar Aug 24 '23 16:08 AnainaM

All right then, I'll be happy to share if I figure it out. By the way, are you doing PhD research related to skeleton-based action recognition?

olayinkaajayi avatar Aug 24 '23 16:08 olayinkaajayi