pytorch-auto-drive
inplace operation error
When I tested RESA with ResNet-50, an error occurred. I then tested SCNN with ResNet-50 and got the same issue.
python main_landet.py --train --config=./configs/lane_detection/resa/resnet50_culane.py --mixed-precision
Loaded torchvision ImageNet pre-trained weights V1.
Not using distributed mode
cuda
Traceback (most recent call last):
File "main_landet.py", line 65, in
@solidexu I don't have a spare GPU right now. I will try to test it tomorrow.
I don't know why, but the issue goes away when I comment out the ReLU in RESAReducer. It's very strange to me.
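For readers unfamiliar with this class of error, here is a minimal sketch of how an in-place ReLU can break the backward pass in PyTorch. It is a generic illustration, not a diagnosis of RESAReducer: the traceback above is truncated, so the actual cause in this repo is not confirmed here. The helper name `backward_succeeds` and the use of `torch.exp` as the preceding op are assumptions chosen only for the demo.

```python
import torch
import torch.nn.functional as F


def backward_succeeds(inplace: bool) -> bool:
    """Run a tiny forward/backward pass; return False on the in-place autograd error.

    Hypothetical helper for illustration only; it is not part of pytorch-auto-drive.
    """
    x = torch.randn(4, requires_grad=True)
    y = torch.exp(x)                 # exp saves its output y for the backward pass
    z = F.relu(y, inplace=inplace)   # inplace=True overwrites y and bumps its version
    try:
        z.sum().backward()
    except RuntimeError as err:
        # Only swallow the specific error this sketch is about
        if "inplace operation" not in str(err):
            raise
        return False
    return True


in_place_ok = backward_succeeds(inplace=True)
out_of_place_ok = backward_succeeds(inplace=False)
print("in-place ReLU backward ok:", in_place_ok)
print("out-of-place ReLU backward ok:", out_of_place_ok)
```

If something similar is happening here, switching the ReLU to its out-of-place form would be a less drastic experiment than commenting it out, since the activation is kept and only the overwrite is avoided.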
What PyTorch version are you using, and do you experience this both with and without mixed precision?
I use torch 1.10.2. I have tested both with and without mixed precision, and the issue is the same.
@solidexu I don't have 1.10 available, but I can start training normally with 1.6.0 (I have only one card, so I first changed world_size to 1 and then used a batch size of only 2).
Here is my command:
python main_landet.py --train --config=./configs/lane_detection/resa/resnet50_culane.py --mixed-precision --batch-size=2
Are you running customized code or do you see that error in the current master branch?
On the current master branch; in fact, I downloaded your new branch three days ago. Commenting out the ReLU also causes another error during training. I think I can try torch 1.6.0.
Try adding a 1x1 conv at the top layer of RESA; it may help.
@solidexu Sorry to disturb you, but did you solve this issue by downgrading PyTorch? I think others have encountered it as well.
Closed by #121.