
Viewer Division by Zero

Open akristoffersen opened this issue 3 years ago • 3 comments

Traceback (most recent call last):
  File "scripts/run_train.py", line 215, in <module>
    main(dcargs.cli(AnnotatedBaseConfigUnion))
  File "scripts/run_train.py", line 203, in main
    launch(
  File "scripts/run_train.py", line 144, in launch
    main_func(local_rank=0, world_size=1, config=config)
  File "/home/eecs/akristoffersen/kair/pyrad/nerfactory/engine/trainer.py", line 56, in train_loop
    trainer.train()
  File "/home/eecs/akristoffersen/kair/pyrad/nerfactory/engine/trainer.py", line 142, in train
    self.visualizer_state.update_scene(step, self.pipeline.model, num_rays_per_batch)
  File "/home/eecs/akristoffersen/kair/pyrad/nerfactory/utils/decorators.py", line 58, in wrapper
    ret = func(self, *args, **kwargs)
  File "/home/eecs/akristoffersen/kair/pyrad/nerfactory/viewer/server/viewer_utils.py", line 330, in update_scene
    if step % num_steps == 0:
ZeroDivisionError: integer division or modulo by zero

This happened while running vanilla NeRF. My guess is that the viewer performed that check before any training steps had occurred.

akristoffersen • Sep 14 '22 23:09

My guess is that the method is so slow that it can't satisfy the constraints of the fps-resolution calculation. The viewer tries to determine how many training steps should be taken in a 1-second period such that training takes a total of Train Util seconds. By default this value is 0.9. This logic will need to change for very slow methods (or we present an error message in the viewer saying that it isn't capable).
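As a rough illustration of the guess above, here is a minimal sketch of how a steps-per-window calculation can round down to zero for a slow method, and how clamping it avoids the modulo error. This is not the nerfstudio implementation: `steps_between_renders`, `train_step_seconds`, and `window_seconds` are hypothetical names, and only the 0.9 default and the `step % num_steps == 0` check come from this thread.

```python
def steps_between_renders(
    train_step_seconds: float,
    train_util: float = 0.9,
    window_seconds: float = 1.0,
) -> int:
    """Hypothetical helper: training steps to run per window.

    The training budget is train_util * window_seconds. If one step takes
    longer than that budget, the integer quotient is 0, which would make
    `step % num_steps` raise ZeroDivisionError. Clamp to at least 1.
    """
    if train_step_seconds <= 0:
        raise ValueError("train_step_seconds must be positive")
    budget_seconds = train_util * window_seconds
    return max(1, int(budget_seconds / train_step_seconds))


def should_render(step: int, num_steps: int) -> bool:
    # Same check as in the traceback; safe because num_steps >= 1.
    return step % num_steps == 0


# A very slow method: one step takes 2 s, far over the 0.9 s budget.
slow = steps_between_renders(train_step_seconds=2.0)
# A faster method: one step takes 0.25 s.
fast = steps_between_renders(train_step_seconds=0.25)
```

With the clamp, a slow method renders after every training step instead of crashing. Whether that is the right behavior, or whether the viewer should report that it cannot keep up, is the open design question raised in the comment.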

tancik • Sep 15 '22 00:09

I'm getting this issue consistently using Nerfacto on a custom Polycam dataset:

  File "C:\Users\Rendering-PC-1\.conda\envs\nerfstudio\lib\site-packages\nerfstudio\engine\trainer.py", line 211, in train
    duration=self.config.pipeline.datamanager.train_num_rays_per_batch / train_t.duration,
ZeroDivisionError: float division by zero

saphtea • Jan 27 '23 22:01

EDIT: issue still occurs with both the poster dataset and custom datasets

saphtea • Jan 27 '23 23:01

Fixed

tancik • Feb 22 '23 18:02

I still get a ZeroDivisionError in nerfstudio\engine\trainer.py:252 with the Instant NGP implementation in nerfstudio==0.3.1. The line is

  duration=self.world_size * self.pipeline.datamanager.get_train_rays_per_batch() / train_t.duration

where train_t.duration can become 0.0.
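One possible way to guard this kind of throughput calculation is to clamp the measured duration to a small floor before dividing. The sketch below is an assumption for illustration, not the fix that was applied in nerfstudio: `rays_per_second` and `MIN_DURATION` are hypothetical names, and the arguments stand in for `self.world_size`, `get_train_rays_per_batch()`, and `train_t.duration` from the line quoted above.

```python
MIN_DURATION = 1e-6  # assumed floor in seconds; not a nerfstudio constant


def rays_per_second(world_size: int, rays_per_batch: int, duration: float) -> float:
    """Hypothetical helper: rays processed per second across all workers.

    A timer can report a duration of exactly 0.0 for a very short
    interval, so divide by max(duration, MIN_DURATION) instead.
    """
    return world_size * rays_per_batch / max(duration, MIN_DURATION)


# Zero duration no longer raises; the result is large but finite.
zero_case = rays_per_second(world_size=1, rays_per_batch=4096, duration=0.0)
# A normal measurement is unaffected by the clamp.
normal_case = rays_per_second(world_size=2, rays_per_batch=1000, duration=0.5)
```

The clamp trades a crash for an inflated throughput number on the affected step. An alternative is to skip logging that step when the duration is zero.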

tobias-kirschstein avatar Jun 21 '23 20:06 tobias-kirschstein