sdfstudio
assertion error training on tanks-and-temple/scan3
Describe the bug
Here is the traceback when training on the sdfstudio dataset tanks-and-temple/scan3:
122440 (12.24%) 116.110 ms 17.64 K 15.88 K
122450 (12.24%) 116.215 ms 17.62 K
122460 (12.25%) 116.229 ms 17.62 K
122470 (12.25%) 116.068 ms 17.65 K
122480 (12.25%) 116.300 ms 17.61 K 16.03 K
122490 (12.25%) 118.293 ms 17.35 K
122500 (12.25%) 118.894 ms 17.28 K
122510 (12.25%) 117.497 ms 17.45 K
122520 (12.25%) 116.755 ms 17.54 K 16.18 K
122530 (12.25%) 116.402 ms 17.60 K
----------------------------------------------------------------------------------------------------
Viewer at: https://viewer.nerf.studio/versions/23-03-9-0/?websocket_url=ws://localhost:7007
Printing profiling stats, from longest to shortest duration in seconds
ViewerState._render_image_in_viewer: 0.1586
Trainer.train_iteration: 0.0529
VanillaPipeline.get_train_loss_dict: 0.0398
Traceback (most recent call last):
  File "/opt/conda/envs/sdf/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/renzhen/userdata/repo/sdfstudio/scripts/train.py", line 250, in entrypoint
    main(
  File "/home/renzhen/userdata/repo/sdfstudio/scripts/train.py", line 236, in main
    launch(
  File "/home/renzhen/userdata/repo/sdfstudio/scripts/train.py", line 175, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/renzhen/userdata/repo/sdfstudio/scripts/train.py", line 90, in train_loop
    trainer.train()
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/engine/trainer.py", line 151, in train
    loss, loss_dict, metrics_dict = self.train_iteration(step)
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/utils/profiler.py", line 43, in wrapper
    ret = func(*args, **kwargs)
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/engine/trainer.py", line 319, in train_iteration
    _, loss_dict, metrics_dict = self.pipeline.get_train_loss_dict(step=step)
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/utils/profiler.py", line 43, in wrapper
    ret = func(*args, **kwargs)
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 273, in get_train_loss_dict
    loss_dict = self.model.get_loss_dict(model_outputs, batch, metrics_dict)
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/models/neus_facto.py", line 308, in get_loss_dict
    loss_dict["interlevel_loss"] = self.config.interlevel_loss_mult * interlevel_loss_zip(
  File "/home/renzhen/userdata/repo/sdfstudio/nerfstudio/model_components/losses.py", line 144, in interlevel_loss_zip
    assert (y_r >= 0.0).all()
AssertionError
To Reproduce
- download the data via ns-download-data
- run ns-train neus-facto-angelo --data tanks-and-temple/scan3/ --pipeline.model.sdf-field.inside-outside True sdfstudio-data --include-mono-prior True
Expected behavior
The model trains without error.
Additional context
torch 1.12.1+cu113 pypi_0 pypi
tinycudann 1.7 pypi_0 pypi
nerfstudio 0.1.12
Hi, sorry for the late reply. interlevel_loss_zip is not numerically stable. You can change it to interlevel_loss.
@niujinshuchong @Legend94rz
Can I ask how to change it? Is it a parameter?
Should I change this:
if self.training:
    loss_dict["interlevel_loss"] = self.config.interlevel_loss_mult * interlevel_loss_zip(
        outputs["weights_list"], outputs["ray_samples_list"]
    )
to this?
if self.training:
    loss_dict["interlevel_loss"] = self.config.interlevel_loss_mult * interlevel_loss(
        outputs["weights_list"], outputs["ray_samples_list"]
    )
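For context on the "not numerically stable" remark: a strict check such as assert (y_r >= 0.0).all() can fail when a quantity that is zero or positive on paper picks up a tiny negative round-off error. The sketch below is a generic, standalone illustration in plain Python, not code from sdfstudio; the helpers all_nonnegative and clamp_nonnegative are hypothetical names, and the thread does not confirm that this is the exact mechanism inside interlevel_loss_zip.

```python
# Illustration only: these helpers are hypothetical and not part of sdfstudio.

def all_nonnegative(values):
    """Return True when every value is >= 0.0 (the same strict test as the failing assert)."""
    return all(v >= 0.0 for v in values)


def clamp_nonnegative(values):
    """Replace negative entries with 0.0, discarding round-off that dips below zero."""
    return [max(v, 0.0) for v in values]


# Both entries are exactly zero in real arithmetic, but the second one
# evaluates to about -5.6e-17 in IEEE 754 double precision.
residuals = [0.5 - (0.25 + 0.25), 0.3 - (0.1 + 0.2)]

print(all_nonnegative(residuals))                     # False
print(all_nonnegative(clamp_nonnegative(residuals)))  # True
```

Clamping also hides values that are negative for a real reason, so it is only a diagnostic aid here; the fix suggested in this thread is to switch to interlevel_loss.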