nerfstudio
output accumulation value is very negative
After training Instant-NGP for a long time (~81,990 steps), I get the following stack trace:
Hmm, this is after #104 was merged, right? ~~I was still getting NaNs, but never this early (maybe at 140k~200k steps).~~ (edit: I ran the train script several times in parallel; one of the runs did produce a NaN in the RGB MLP at around 90k steps)
As an FYI, -9223372036854775808 is what we get from:
>>> torch.tensor(torch.nan).long()
tensor(-9223372036854775808)
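Since a NaN turns into that huge negative integer silently when cast, it may help to check for non-finite values before the cast so the failure points at the real cause. Below is a minimal sketch of such a guard. The helper names `assert_all_finite` and `safe_long` are hypothetical and are not part of nerfstudio; only `torch.isfinite` and `Tensor.long` are real PyTorch calls.

```python
import torch


def assert_all_finite(name: str, tensor: torch.Tensor) -> None:
    """Raise a descriptive error if `tensor` contains NaN or +/-inf.

    Hypothetical helper for illustration; not a nerfstudio API.
    """
    bad = ~torch.isfinite(tensor)
    if bad.any():
        raise FloatingPointError(
            f"{name} has {int(bad.sum())} non-finite value(s) "
            f"out of {tensor.numel()}"
        )


def safe_long(name: str, tensor: torch.Tensor) -> torch.Tensor:
    """Cast to int64 only after confirming every element is finite."""
    assert_all_finite(name, tensor)
    return tensor.long()
```

Calling `safe_long("accumulation", acc)` on a tensor that contains a NaN would then raise a `FloatingPointError` naming the tensor, instead of passing a very negative integer downstream.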
I'm going to reopen this for now. The NaN can be fixed if the precision is changed, but the performance worsens.
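For context on why precision can matter here, this is a small sketch of one common way reduced precision leads to NaNs. It assumes the precision in question is float16, which the thread does not state, and it is not a confirmed diagnosis of this issue. A value that is fine in float32 overflows to inf in float16, and inf minus inf is NaN.

```python
import torch

# Assumption: the reduced precision in question is float16.
# float16 tops out at 65504, so moderately large values overflow to inf.
fp16_max = torch.finfo(torch.float16).max

value = torch.tensor(70000.0)        # finite in float32
as_half = value.to(torch.float16)    # overflows to inf in float16

# inf - inf is NaN, so a single overflow can poison later arithmetic.
# The subtraction is done after casting back to float32 so the demo
# only relies on dtype casts for the float16 part.
poisoned = as_half.float() - as_half.float()

print(fp16_max, as_half, poisoned)
```

Keeping the same computation in float32 avoids the overflow, at the cost of the slower training mentioned above.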
This still seems to be an ongoing problem. I believe it is a known issue, but I'm posting the stack trace as an update.
Error after 174,990 steps:
Hi @evonneng, could you try the solution in https://github.com/nerfstudio-project/nerfstudio/pull/910 and let me know if it fixes your issue?