
Problem with training code

Open ftlong6666 opened this issue 5 years ago • 1 comment

Hello @Eromera, thank you so much for your work. I have a problem running your code with the newest version of PyTorch. According to PyTorch, optimizer.step() should be called before lr_scheduler.step(), so I moved the scheduler call out of the step loop. It is now called once per epoch, after optimizer.step().
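For reference, the call order described above could look roughly like the sketch below. It is not the code from main.py: the model, the data, `num_epochs`, and the decay lambda are placeholders chosen only to make the example run on CPU.

```python
import torch
from torch import nn, optim

# Hypothetical stand-ins for the real network and dataset.
model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=5e-4)

num_epochs = 3  # assumed value, for illustration only
# Assumed polynomial decay; the schedule used in the repository may differ.
scheduler = optim.lr_scheduler.LambdaLR(
    optimizer, lambda epoch: (1 - epoch / num_epochs) ** 0.9
)

inputs = torch.randn(8, 4)
targets = torch.randint(0, 2, (8,))

lrs = []  # learning rate in effect during each epoch
for epoch in range(num_epochs):
    lrs.append(optimizer.param_groups[0]["lr"])
    for _ in range(2):  # step loop over (fake) batches
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()  # update the weights first
    scheduler.step()  # then advance the schedule, once per epoch
```

With this ordering, the first epoch runs at the initial learning rate and the rate only changes after at least one optimizer.step() has been made.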

However, it still seems to give this error:

 UserWarning: NLLLoss2d has been deprecated. Please use NLLLoss instead as a drop-in replacement and see https://pytorch.org/docs/master/nn.html#torch.nn.NLLLoss for more details.
  warnings.warn("NLLLoss2d has been deprecated. "
<class '__main__.CrossEntropyLoss2d'>
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  0.0005029990024359241
Traceback (most recent call last):
  File "main.py", line 510, in <module>
    main(parser.parse_args())
  File "main.py", line 483, in main
    model = train(args, model, False)   #Train decoder
  File "main.py", line 238, in train
    optimizer.step()
  File "/home/server0/anaconda3/envs/deep36_andi/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 66, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/server0/anaconda3/envs/deep36_andi/lib/python3.6/site-packages/torch/optim/adam.py", line 103, in step
    denom = (exp_avg_sq.sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])
RuntimeError: CUDA error: device-side assert triggered
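A note on reading this traceback: CUDA errors are reported asynchronously, so the failing kernel is not necessarily the one in optimizer.step(). Running with the environment variable CUDA_LAUNCH_BLOCKING=1, or running on CPU, usually gives a more precise location. One common cause of a device-side assert with NLLLoss is a target label outside the range [0, num_classes). Nothing in this thread confirms that this is the cause here. The sketch below only shows how such labels could be detected; `check_targets`, the class count of 20, and the sample tensor are hypothetical.

```python
import torch

def check_targets(targets, num_classes, ignore_index=None):
    """Return the label values in `targets` that fall outside [0, num_classes).

    `ignore_index`, if given, is treated as valid and not reported.
    """
    values = torch.unique(targets).tolist()  # unique values, sorted ascending
    return [
        v for v in values
        if not (0 <= v < num_classes) and v != ignore_index
    ]

# Made-up label map: 20 and 255 are out of range for 20 classes.
sample_targets = torch.tensor([[0, 1, 19], [255, 3, 20]])

bad = check_targets(sample_targets, num_classes=20)
bad_with_ignore = check_targets(sample_targets, num_classes=20, ignore_index=255)
```

If a check like this returns a non-empty list for a real batch, the labels need to be remapped, or the loss needs a matching ignore_index, before training on GPU.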

ftlong6666, Aug 10 '20 18:08

Did you solve this problem?

XuKer, Feb 23 '22 05:02