RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Open cv-dote opened this issue 3 years ago • 1 comments

Thanks for this amazing library!
I tried to train VQAD on the RTMV dataset and got the following error:

$ python3 app/main.py --config configs/vqad_nerf.yaml --dataset-path ../dataset/rtmv/amazon_berkely-003/00001/
...
Traceback (most recent call last):
  File "app/main.py", line 35, in <module>
    trainer.train()
  File "/home/zaima/research/kaolin-wisp/wisp/trainers/base_trainer.py", line 473, in train
    self.iterate()
  File "/home/zaima/research/kaolin-wisp/wisp/trainers/base_trainer.py", line 340, in iterate
    self.step(self.epoch, self.iteration, data)
  File "/home/zaima/research/kaolin-wisp/wisp/trainers/multiview_trainer.py", line 88, in step
    self.scaler.scale(loss).backward()
  File "/home/zaima/anaconda3/envs/wisp/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/zaima/anaconda3/envs/wisp/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

My Environment

CUDA v11.3
Ubuntu 20.04
GeForce RTX 3090

I've already trained on the V8 dataset, and that run finished successfully. I checked the RTMV dataset class at https://github.com/NVIDIAGameWorks/kaolin-wisp/blob/main/wisp/datasets/formats/rtmv.py , but I don't understand why training fails on RTMV datasets other than V8.
Thanks in advance!

cv-dote, Aug 22 '22 05:08

Hi, I ran into the same error when trying to train with the octree, triplanar, and codebook feature grids on a 'standard' format multiview dataset (but did not get the error with the hashgrid). I was just wondering if there were any updates on this issue.

Additional comment: When I change the raymarch_type config parameter to 'ray' instead of 'voxel', the error goes away. However, I'm not sure what the root cause is.

Salarios77, Sep 06 '22 18:09

Hi @Salarios77 @cv-dote, I've recently updated the tracers to handle the case where rays have no intersection. Do you still see this issue?

Caenorst, Oct 13 '22 14:10

I no longer have the issue, thank you very much for addressing it!

Salarios77, Oct 13 '22 19:10

The issue has been resolved for me too. Thank you for updating that.
I will close this issue.

cv-dote, Oct 19 '22 05:10
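
The thread never spells out the root cause, but the fix (tracers that handle rays with no intersection) is consistent with a loss that ends up disconnected from the model parameters. The sketch below reproduces the same error message in plain PyTorch under that assumption. `toy_render` and `safe_backward` are hypothetical names made up for illustration; they are not kaolin-wisp APIs and this is not how the library's tracers are implemented.

```python
import torch


def toy_render(features: torch.Tensor, hit_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a tracer: only rays that hit something use `features`."""
    out = torch.zeros(hit_mask.shape[0])
    if hit_mask.any():
        # The in-place write links `out` to `features` in the autograd graph.
        out[hit_mask] = features[hit_mask] * 2.0
    return out


def safe_backward(loss: torch.Tensor) -> bool:
    """Run backward only if the loss is attached to a graph; report whether it ran."""
    if not loss.requires_grad:
        return False
    loss.backward()
    return True


features = torch.ones(4, requires_grad=True)
target = torch.zeros(4)

# Some rays hit: the loss depends on `features`, so backward works.
some_hits = torch.tensor([True, False, True, False])
hit_loss = (toy_render(features, some_hits) - target).pow(2).mean()
ran_with_hits = safe_backward(hit_loss)

# No ray hits: the output is a constant tensor, so the loss has no grad_fn.
no_hits = torch.zeros(4, dtype=torch.bool)
miss_loss = (toy_render(features, no_hits) - target).pow(2).mean()
try:
    miss_loss.backward()
    error_message = None
except RuntimeError as err:
    # Same RuntimeError text as in the traceback above.
    error_message = str(err)

# The guarded version skips the step instead of raising.
ran_without_hits = safe_backward(miss_loss)
```

Checking `loss.requires_grad` before calling `backward()` is only a diagnostic aid here. If a batch really produces no intersections, it is probably worth finding out why (for example, scene bounds or camera poses) rather than silently skipping the step.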