RuntimeError("grad can be implicitly created only for scalar outputs")

Open · e1339g opened this issue 1 year ago · 1 comment

Hello, thank you for your great work. I am training on my own dataset and encountered the following error.

Ignoring class  0  in IoU evaluation
[IOU EVAL] IGNORE:  tensor([0])
[IOU EVAL] INCLUDE:  tensor([1, 2])
Lr: 3.106e-05 | Update: 2.258e-01 mean,4.181e-01 std | Epoch: [0][0/322] | Time 3.170 (3.170) | Data 0.154 (0.154) | Loss 1.9250 (1.9250) | acc 0.533 (0.533) | IoU 0.363 (0.363) | [1 day, 20:35:54]
Traceback (most recent call last):
  File "/content/LiDAR-MOS/mos_SalsaNext/train/tasks/semantic/train.py", line 178, in <module>
    trainer.train()
  File "../../tasks/semantic/modules/trainer.py", line 274, in train
    show_scans=self.ARCH["train"]["show_scans"])
  File "../../tasks/semantic/modules/trainer.py", line 391, in train_epoch
    loss_m.backward()
  File "/usr/local/lib/python3.7/dist-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 166, in backward
    grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 67, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs

The relevant code in trainer.py (train_epoch):

optimizer.zero_grad()
if self.n_gpus > 1:
    idx = torch.ones(self.n_gpus).cuda()
    loss_m.backward(idx)
else:
    loss_m.backward()  # this is where I get the error
optimizer.step()

I searched for this error online, and it usually seems to occur when two or more GPUs are used. However, I am using only one GPU and still get it. Could you please help me solve it?

e1339g, Sep 11 '22 09:09

This looks like a version conflict; I have not run into this problem before.

Could you please provide your environment info here? Let's see whether other users have any solutions or not.
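
A short script along these lines could be used to collect the environment info. It is only a sketch: the helper name `collect_env_info` and the default package list are assumptions for illustration, not part of LiDAR-MOS. PyTorch also ships its own reporter, which can be run with `python -m torch.utils.collect_env`.

```python
import importlib
import platform
import sys


def collect_env_info(packages=("torch", "torchvision", "numpy")):
    """Return a dict with interpreter, OS, and package versions.

    `packages` is a hypothetical default list; adjust it to the
    dependencies you actually have installed.
    """
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            info[name] = "not installed"
        except Exception as exc:  # e.g. a broken binary install
            info[name] = "import failed: " + type(exc).__name__

    # Only query CUDA details if torch was imported successfully above.
    torch = sys.modules.get("torch")
    if torch is not None:
        info["cuda (torch build)"] = str(torch.version.cuda)
        info["visible gpus"] = str(torch.cuda.device_count())
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(key + ": " + value)
```

Pasting the printed lines into the issue, together with the GPU model, should be enough for others to compare versions.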

Chen-Xieyuanli, Sep 22 '22 03:09
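
Note on the error message itself: PyTorch raises "grad can be implicitly created only for scalar outputs" when `.backward()` is called without a gradient argument on a tensor that has more than one element. So in the single-GPU branch above, `loss_m` apparently is not a scalar at that point, and printing `loss_m.shape` just before the call would confirm this. The sketch below shows the general pattern. The helper name `backward_on_scalar` is hypothetical, and averaging with `.mean()` is an assumption about the right reduction, not a confirmed fix for LiDAR-MOS.

```python
def backward_on_scalar(loss):
    """Call backward() on `loss`, averaging it first if it has more than one element.

    Assumption: the mean is an acceptable reduction for this loss.
    """
    if loss.numel() != 1:
        loss = loss.mean()  # reduce to a 0-dim tensor
    loss.backward()
    return loss


def demo():
    """Reproduce the error on a toy tensor, then apply the workaround."""
    import torch  # imported here so the helper above has no hard dependency

    w = torch.ones(3, requires_grad=True)
    per_item_loss = w * torch.tensor([1.0, 2.0, 3.0])  # shape (3,), not a scalar

    try:
        per_item_loss.backward()  # no gradient argument given
    except RuntimeError as err:
        print("reproduced:", err)

    reduced = backward_on_scalar(per_item_loss)
    print("reduced shape:", tuple(reduced.shape))
    print("grad of w:", w.grad)
```

Calling `demo()` with PyTorch installed reproduces the message and then runs the reduced version. If `loss_m` does turn out to have several elements on a single GPU, the more useful question is why, for example a loss built without a reduction or a model still wrapped for multi-GPU use, rather than only averaging it away.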