M3VSNet

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Open agenthong opened this issue 4 years ago • 16 comments

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 3, 3]] is at version 17; expected version 13 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Maybe the code needs to be changed to use different variables.

agenthong avatar Jan 14 '21 08:01 agenthong
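
The hint in the error message can be tried on a toy case. The following is a minimal sketch, not code from M3VSNet: the tensor names and the in-place `mul_` call are hypothetical stand-ins chosen only to trigger the same error on CPU. With anomaly detection enabled, PyTorch also prints the traceback of the forward call whose gradient could not be computed.

```python
import torch

def run_backward_with_inplace_edit():
    # Hypothetical stand-ins: k plays the role of a 1x3x3 intrinsics matrix,
    # xyz the role of a batch of 3D points.
    k = torch.eye(3).unsqueeze(0)
    xyz = torch.rand(1, 3, 5, requires_grad=True)
    out = torch.matmul(k, xyz)   # autograd keeps k to compute the gradient w.r.t. xyz
    k.mul_(0.5)                  # in-place edit: k's version counter no longer matches
    out.sum().backward()         # expected to raise the RuntimeError from this issue

# Anomaly detection is slow, so it is enabled only around the failing step.
with torch.autograd.set_detect_anomaly(True):
    try:
        run_backward_with_inplace_edit()
        message = ""
    except RuntimeError as err:
        message = str(err)

print(message)
```

The forward traceback printed by anomaly detection points at the operation that saved the tensor, not at the line that later modified it, so the in-place edit still has to be located by reading the code after that point.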

I can run it successfully. Did you delete some code, such as ".clone()"?

whubaichuan avatar Jan 14 '21 10:01 whubaichuan

I didn't edit the code. But I'm wondering whether it is related to the environment, because I haven't specifically checked that my environment matches the required one.

agenthong avatar Jan 14 '21 11:01 agenthong

@agenthong You need to check your environment or google this error.

whubaichuan avatar Jan 15 '21 02:01 whubaichuan

@whubaichuan Thanks. By the way, can you share the requirements list so that we can install the environment directly?

agenthong avatar Jan 15 '21 03:01 agenthong

@agenthong Here

whubaichuan avatar Jan 15 '21 10:01 whubaichuan

Thanks for replying. I can't directly use 'pip install -r' to install the environment. Can you share the necessary environment list in a valid format?

agenthong avatar Jan 21 '21 15:01 agenthong

Update: I changed the PyTorch version to match the required environment, and the problem is solved.

agenthong avatar Jan 24 '21 04:01 agenthong

@agenthong Congratulations!

whubaichuan avatar Jan 24 '21 09:01 whubaichuan

I ran into the same error. Could you tell me which version of PyTorch solves this problem? Thanks a lot!

vangoghcat avatar Feb 24 '21 09:02 vangoghcat

@fox16789 I use PyTorch 1.0.1.

whubaichuan avatar Feb 25 '21 01:02 whubaichuan

Actually, I didn't solve this problem, even with the corresponding PyTorch version. I'm checking the code to figure out whether I can fix it.

agenthong avatar Feb 25 '21 08:02 agenthong

The error occurs at mvsnet.py:329:

K_xyz_src = torch.matmul(intrinsics_src, xyz_src) #B*3*20480

intrinsics_src seems to be modified later on. I changed the code to:

intrinsics_src_ = intrinsics_src
K_xyz_src = torch.matmul(intrinsics_src_, xyz_src) #B*3*20480

It works, but I don't know if it's right.

silence401 avatar Mar 01 '21 15:03 silence401

I tried this but it didn't work.

agenthong avatar Mar 02 '21 06:03 agenthong

I also changed every loss += to loss = loss + in mvsnet.py. Maybe you can try that.

silence401 avatar Mar 02 '21 06:03 silence401
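
The difference between the two spellings can be seen on a toy tensor. This is a minimal sketch, not taken from mvsnet.py; it reads `_version`, an internal attribute that exposes the counter autograd checks, so treat it as a debugging aid only.

```python
import torch

a = torch.zeros(3)
b = torch.ones(3)
alias = a            # second name for the same tensor object

a += b               # in-place: the existing tensor is modified, its version counter goes up
same_object_after_inplace = alias is a
version_after_inplace = a._version

a = a + b            # out-of-place: a now names a new tensor, the old one is untouched
same_object_after_rebind = alias is a

print(same_object_after_inplace, version_after_inplace)
print(same_object_after_rebind, a._version, alias._version)
```

Rewriting `+=` as an explicit addition only helps when the accumulated tensor is the one autograd saved; other in-place edits (for example, indexed assignment or methods ending in an underscore) are not affected by this change.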

I also encountered the same problem (in the same environment). I not only tried using a=a+b instead of a+=b, but also modified the code like this:

intrinsics_src_ = intrinsics_src
K_xyz_src = torch.matmul(intrinsics_src_, xyz_src) #B*3*20480

It's still not working.

The error message is as follows:

sys:1: RuntimeWarning: Traceback of forward call that caused the error:
  File "train.py", line 398, in <module>
    train()
  File "train.py", line 155, in train
    loss,scalar_outputs,image_outputs = train_sample(sample, detailed_summary=do_summary)
  File "train.py", line 271, in train_sample
    loss,loss_s,loss_photo,loss_ssim,mask_calculate,mask_num,loss_perceptual,loss_normal,normal_by_depth,error_depth_by_normal,depth_by_normal=model_loss(depth_est,intrinsics,extrinsics,sample_cuda["imgs"],mask_photometric,outputs_feature)
  File "/home/jojo/Documents/github/M3VSNet/models/mvsnet.py", line 1024, in mvsnet_loss
    x_src_perceptual,y_src_perceptual=project_with_depth(depth_est_perceptual, ref_intrinsics_perceptual, ref_extrinsics, src_intrinsics_perceptual, src_extrinsics[i])
  File "/home/jojo/Documents/github/M3VSNet/models/mvsnet.py", line 320, in project_with_depth
    K_xyz_src = torch.matmul(intrinsics_src, xyz_src) #B*3*20480

Traceback (most recent call last):
  File "train.py", line 398, in <module>
    train()
  File "train.py", line 155, in train
    loss,scalar_outputs,image_outputs = train_sample(sample, detailed_summary=do_summary)
  File "train.py", line 273, in train_sample
    loss.backward()
  File "/home/jojo/anaconda3/envs/m3vsnet/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/jojo/anaconda3/envs/m3vsnet/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

hzlajojo avatar Oct 22 '21 08:10 hzlajojo

The problem has been solved on my side by changing the code like this:

K_xyz_src = torch.matmul(intrinsics_src.clone(), xyz_src.clone())

hzlajojo avatar Oct 22 '21 09:10 hzlajojo
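
A possible explanation for why `.clone()` helped while the earlier `intrinsics_src_ = intrinsics_src` attempt did not for everyone: plain assignment only adds a second name for the same tensor, whereas `clone()` gives `matmul` its own copy. The sketch below illustrates this under stated assumptions. The `project` function and the `mul_` call are hypothetical stand-ins, since the thread does not identify where M3VSNet modifies the tensor in place.

```python
import torch

def project(intrinsics, xyz, copy_first):
    # Hypothetical stand-in for the matmul in project_with_depth.
    # Without clone(), k is just another name for the caller's tensor.
    k = intrinsics.clone() if copy_first else intrinsics
    out = torch.matmul(k, xyz)   # autograd saves k for the backward pass
    intrinsics.mul_(0.5)         # hypothetical later in-place edit of the original
    return out

def backward_succeeds(copy_first):
    intrinsics = torch.eye(3).unsqueeze(0)          # shape 1x3x3, as in the error message
    xyz = torch.rand(1, 3, 5, requires_grad=True)
    try:
        project(intrinsics, xyz, copy_first).sum().backward()
        return True
    except RuntimeError:
        return False

alias_ok = backward_succeeds(copy_first=False)
clone_ok = backward_succeeds(copy_first=True)
print(alias_ok, clone_ok)
```

Cloning costs an extra copy per call, so an alternative is to find the in-place edit itself and make it out-of-place; which is preferable depends on the actual code path.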