nerfstudio
When there are lots of training images, bad things happen
Is your feature request related to a problem? Please describe.
If I am loading lots of images (e.g. ~5000) with the Nerfstudio
data parser, two things happen:
- I have to set `train_num_images_to_sample_from` to something reasonable to prevent OOM, like 300. But then I also have to hard-code a `num_times_to_repeat_images`, since this parameter is not set by the config.
  - The Record3D data parser kinda solves this by only loading a `max_dataset_size` number of images. But then it's stuck with the same images for the entire training, which doesn't feel right.
- The viewer gets really laggy trying to load all the images.
Describe the solution you'd like
Some quality-of-life default values so I do not have to use so many command-line arguments:
- [ ] Set the default values for `train_num_images_to_sample_from` and `eval_num_images_to_sample_from` to something reasonable, like 300, and then add some logic to detect when it is suitable to cache all images (notably when the dataset contains < 300 images)
- [x] Limit the number of images shown in the viewer to something reasonable, like 512, and control this value in the config. Note that this only changes what is shown in the viewer, and does not change the actual images used in training.
- [ ] Add `train_num_times_to_repeat_images` and `eval_num_times_to_repeat_images` parameters to the config, with default values `-1`
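
To make the proposal concrete, here is a rough sketch of what I have in mind. It is not nerfstudio's actual code: the class `ProposedDataManagerDefaults`, the field `max_num_viewer_images`, and both helper functions are hypothetical names, and I am assuming `-1` can stand for "cache every image". The evenly spaced viewer subset is also just one possible choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProposedDataManagerDefaults:
    """Hypothetical container for the proposed defaults (not a nerfstudio class)."""

    train_num_images_to_sample_from: int = 300
    train_num_times_to_repeat_images: int = -1
    eval_num_images_to_sample_from: int = 300
    eval_num_times_to_repeat_images: int = -1
    # Hypothetical field name for the viewer limit described above.
    max_num_viewer_images: int = 512


def resolve_num_images_to_sample_from(dataset_size: int, requested: int) -> int:
    """Return how many images to collate, or -1 to cache the whole dataset.

    Assumption: -1 means "cache all images". A dataset that is no larger
    than the requested subset is cached whole, since sampling would not
    save any memory.
    """
    if requested < 0 or dataset_size <= requested:
        return -1
    return requested


def viewer_image_indices(dataset_size: int, max_images: int) -> List[int]:
    """Pick an evenly spaced subset of image indices to show in the viewer.

    This only affects what is displayed, not which images are trained on.
    """
    if dataset_size <= max_images:
        return list(range(dataset_size))
    step = dataset_size / max_images
    return [int(i * step) for i in range(max_images)]
```

With these helpers, a 5000-image dataset would collate 300 images at a time and show 512 in the viewer, while a 120-image dataset would simply be cached in full.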
I appreciate any feedback on the above proposed changes, since I admit I am a new user and I don't want to make breaking changes to the API that people don't like. But I hope that my pain points were felt by others as well.
Describe alternatives you've considered
Additional context
If this feature is desired, I can implement it.
Yea the current support for large datasets isn't great and needs some updates. In your proposed solution you mention setting the default to 300 in the first bullet and setting it to -1 in the last bullet. Can you clarify?
(ps. Responses may be a bit delayed the next few days since a number of us are at a conference at the moment)
No problem! None of this is urgent. Thanks for responding.
Oops, my bad! I will edit now. I meant to set the size of the collated batch to 300, and the number of iterations to use that batch to -1 (i.e. do not repick images).
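
To illustrate the semantics I mean, here is a minimal sketch of how the two values would interact. `ImageSubsetSampler` is a made-up class for illustration and not the actual nerfstudio dataloader; I am assuming a repeat count of N means "use the same subset for N iterations, then repick", and that `-1` means "never repick".

```python
import random
from typing import List, Optional


class ImageSubsetSampler:
    """Hypothetical sampler: holds a subset of image indices and repicks on a schedule."""

    def __init__(
        self,
        dataset_size: int,
        num_images_to_sample_from: int,
        num_times_to_repeat_images: int,
        seed: int = 0,
    ) -> None:
        self.dataset_size = dataset_size
        self.num_images_to_sample_from = num_images_to_sample_from
        self.num_times_to_repeat_images = num_times_to_repeat_images
        self._rng = random.Random(seed)
        self._subset: Optional[List[int]] = None
        self._uses = 0

    def _pick(self) -> List[int]:
        # Sample without replacement, capped at the dataset size.
        k = min(self.num_images_to_sample_from, self.dataset_size)
        return sorted(self._rng.sample(range(self.dataset_size), k))

    def next_subset(self) -> List[int]:
        """Return the image indices to draw rays from on this iteration."""
        never_repick = self.num_times_to_repeat_images == -1
        expired = (not never_repick) and self._uses >= self.num_times_to_repeat_images
        if self._subset is None or expired:
            self._subset = self._pick()
            self._uses = 0
        self._uses += 1
        return self._subset


# Batch of 300 out of 5000 images, never repicked (-1).
sampler = ImageSubsetSampler(5000, 300, -1)
first = sampler.next_subset()
assert all(sampler.next_subset() == first for _ in range(10))
```

Setting the repeat count to a positive number instead would let training cycle through the whole dataset over time, which avoids being stuck with the same images for the entire run.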