nerfstudio
Gaussian splatting cannot set --auto-scale-poses to False?
Describe the bug
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
when training 3DGS with --auto-scale-poses False.
To Reproduce
Steps to reproduce the behavior:
ns-train gaussian-splatting --experiment-name waymo-102751-res-1-iter-50000-no-scale --max-num-iterations 50000 --pipeline.datamanager.cache-images gpu colmap --data /home/ubuntu/yifanlu/SuGaR/data/waymo-102751 --downscale-factor 1 --auto-scale-poses False
Error:
No Nerfstudio checkpoint to load, so training from scratch.
Disabled comet/tensorboard/wandb event writers
Printing profiling stats, from longest to shortest duration in seconds
Trainer.train_iteration: 2.6733
VanillaPipeline.get_train_loss_dict: 2.6713
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/nerfstudio/bin/ns-train", line 8, in <module>
sys.exit(entrypoint())
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/scripts/train.py", line 262, in entrypoint
main(
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/scripts/train.py", line 247, in main
launch(
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/scripts/train.py", line 189, in launch
main_func(local_rank=0, world_size=world_size, config=config)
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/scripts/train.py", line 100, in train_loop
trainer.train()
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/engine/trainer.py", line 252, in train
loss, loss_dict, metrics_dict = self.train_iteration(step)
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/utils/profiler.py", line 112, in inner
out = func(*args, **kwargs)
File "/home/ubuntu/yifanlu/nerfstudio/nerfstudio/engine/trainer.py", line 475, in train_iteration
self.grad_scaler.scale(loss).backward() # type: ignore
File "/home/ubuntu/miniconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/home/ubuntu/miniconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Expected behavior
No runtime error.
Screenshots
Additional context
Other methods such as nerfacto work well with my dataset with --auto-scale-poses False.
@yifanlu0227 can you try colmap --auto-scale-poses False instead of just --auto-scale-poses False?
@maturk Hi! I think I have already put colmap before --auto-scale-poses False. How should I modify my command? Thanks!
@maturk I got the same error, RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn, on a fresh install. The dozer dataset works fine with nerfacto but throws this error with splatfacto.
Commands to repro on my machine:
- install from source based on the README
- ns-download-data nerfstudio --capture-name=dozer
- ns-train splatfacto --data data/nerfstudio/dozer/
During the training run I got a warning message: load_3D_points is true, but the dataset was processed with an outdated ns-process-data that didn't convert colmap points to .ply! Update the colmap dataset automatically? I clicked yes.
It works for the library dataset, which also shows the warning, so I'm not sure what the root cause is.
I think there are two issues that need to be addressed:
(1) The error message RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn isn't very helpful in describing why training crashes. It often happens when the poses are catastrophically wrong, or when the poses and the initial point cloud are in different coordinate systems.
(2) The fact that --auto-scale-poses False doesn't work may reveal a bug in how the point cloud is transformed.
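One way to check the situation described in (1) before training is to count how many initial points land in front of a given camera. The following is only a minimal sketch, not part of nerfstudio: the function names and sample values are hypothetical, and it assumes 3x4 camera-to-world matrices in the OpenGL convention (camera looking down -z), which is the convention nerfstudio documents for its poses.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Mat34 = Sequence[Sequence[float]]  # camera-to-world [R | t], three rows of four floats


def world_to_camera(c2w: Mat34, p: Vec3) -> List[float]:
    """Map a world-space point into the camera frame: R^T (p - t)."""
    d = [p[i] - c2w[i][3] for i in range(3)]
    return [sum(c2w[i][j] * d[i] for i in range(3)) for j in range(3)]


def fraction_in_front(c2w: Mat34, points: Sequence[Vec3], near: float = 0.01) -> float:
    """Fraction of points with depth > near, assuming the camera looks down -z."""
    if not points:
        return 0.0
    visible = sum(1 for p in points if -world_to_camera(c2w, p)[2] > near)
    return visible / len(points)


# Hypothetical example: identity rotation, camera at the origin looking down -z.
c2w = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0]]
points_ok = [(0.0, 0.0, -2.0), (0.5, 0.1, -3.0)]
points_flipped = [(0.0, 0.0, 2.0), (0.5, 0.1, 3.0)]  # e.g. a mismatched axis convention

print(fraction_in_front(c2w, points_ok))
print(fraction_in_front(c2w, points_flipped))
```

If the fraction is close to zero for most cameras, the poses and the point cloud are likely in different coordinate frames or at different scales, which would be consistent with the failure mode described in (1).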
Have you solved this problem? When training 3DGS with --auto-scale-poses False, I get element 0 of tensors does not require grad and does not have a grad_fn. However, some datasets train smoothly using --auto-scale-poses, while others encounter errors using --auto-scale-poses.