Unable to load a previous checkpoint (Record3d)

Open francescofugazzi opened this issue 1 year ago • 5 comments

Hi,

I trained on a dataset until the default final step. Now I want to reload the result to set up the camera, but I'm unable to do so.

If I use this command:

ns-train nerfacto record3d-data --data data\record-3d\Palma --trainer.load-dir outputs\data\record-3d\Palma\nerfacto\2022-10-09_150158\nerfstudio_models\step-000029999.ckpt --vis viewer --viewer.websocket-port 7007 --trainer.load-config outputs\data\record-3d\Palma\nerfacto\2022-10-09_150158\config.yml --viewer.start-train False

it doesn't recognize the parameters that come after --data. If I remove record3d-data, it loads all the parameters but wants to start the training from scratch.

francescofugazzi avatar Oct 09 '22 14:10 francescofugazzi

Hi! The ordering of the parameters matters. Specifically, anything after the data parser type (record3d-data in your case) can only be a parameter of that data parser. So please try this command:

ns-train nerfacto --trainer.load-dir outputs\data\record-3d\Palma\nerfacto\2022-10-09_150158\nerfstudio_models\step-000029999.ckpt --vis viewer --viewer.websocket-port 7007 --trainer.load-config outputs\data\record-3d\Palma\nerfacto\2022-10-09_150158\config.yml --viewer.start-train False record3d-data --data data\record-3d\Palma 

This is because of the way the CLI is structured. We know it is a bit confusing, so we are trying to think of better ways to handle this while still keeping the typing/autocomplete features.
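The ordering rule can be illustrated by analogy with Python's standard argparse subcommands. This is only a sketch of the general idea, not nerfstudio's actual CLI code; the program name `train-sketch` and the reduced option set are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Toy parser: one top-level option and one data-parser subcommand."""
    parser = argparse.ArgumentParser(prog="train-sketch")  # hypothetical name
    parser.add_argument("--load-dir", default=None)  # belongs to the top level
    subparsers = parser.add_subparsers(dest="dataparser")
    record3d = subparsers.add_parser("record3d-data")
    record3d.add_argument("--data", required=True)  # belongs to the subcommand
    return parser


parser = build_parser()

# Top-level option before the subcommand: both values are picked up.
ok = parser.parse_args(["--load-dir", "ckpts", "record3d-data", "--data", "scene"])

# Top-level option after the subcommand: the sub-parser does not know it,
# so it ends up in the list of unrecognized arguments.
known, unknown = parser.parse_known_args(
    ["record3d-data", "--data", "scene", "--load-dir", "ckpts"]
)
```

In the second call the subcommand has taken over parsing, so `--load-dir` is no longer matched against the top-level options, which mirrors why the trainer and viewer flags have to come before record3d-data.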

Thanks for your question!

evonneng avatar Oct 10 '22 21:10 evonneng

Thanks, but the command you provided starts the training again, which is the same problem I had.

francescofugazzi avatar Oct 11 '22 08:10 francescofugazzi

A fix for this was merged in #741. You will need to clone the repo to get the fix; it is not yet in the pip package.

tancik avatar Oct 11 '22 15:10 tancik

I updated and ran setup.py again. Same result: [trainer.py:277] No checkpoints to load, training from scratch. Even when I provide the checkpoint, it starts everything again.

francescofugazzi avatar Oct 11 '22 18:10 francescofugazzi

Hi, I also encountered the same issue. I believe it's a bug: in the train script, the config is overwritten by the one loaded from load_config, and trainer.load_dir in the old config is None (the default). I made a small change, and it can now load the previous checkpoint successfully.

https://github.com/nerfstudio-project/nerfstudio/blob/d9f6426e0da555d42ad3017850a52dad8e5f7749/scripts/train.py#L226-L228

Save out the config.trainer.load_dir first and put it back after yaml.load:

    if config.trainer.load_config:
        CONSOLE.log(f"Loading pre-set config from: {config.trainer.load_config}")
        # Remember the checkpoint directory passed on the command line (may be None)
        load_dir = config.trainer.load_dir
        # Loading the saved config replaces the whole config object
        config = yaml.load(config.trainer.load_config.read_text(), Loader=yaml.Loader)
        # Put the command-line checkpoint directory back, if one was given
        if load_dir is not None:
            config.trainer.load_dir = load_dir

The same happens to config.viewer.start_train: the value given on the command line is replaced by the one stored in the loaded config.
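A minimal, self-contained sketch of the same idea covering both fields might look like the following. The dataclasses and the `merge_cli_overrides` helper are hypothetical stand-ins written for this example, not nerfstudio's real config classes, and always copying start_train from the command line is an assumption about the desired behavior.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class TrainerConfig:  # hypothetical stand-in for the trainer settings
    load_dir: Optional[Path] = None


@dataclass
class ViewerConfig:  # hypothetical stand-in for the viewer settings
    start_train: bool = True


@dataclass
class Config:
    trainer: TrainerConfig = field(default_factory=TrainerConfig)
    viewer: ViewerConfig = field(default_factory=ViewerConfig)


def merge_cli_overrides(cli: Config, saved: Config) -> Config:
    """Re-apply command-line values onto a config loaded from disk."""
    # Keep the saved load_dir unless the command line supplied one.
    if cli.trainer.load_dir is not None:
        saved.trainer.load_dir = cli.trainer.load_dir
    # Assumption: the command-line value of start_train always wins.
    saved.viewer.start_train = cli.viewer.start_train
    return saved


# Usage: the saved config has the defaults, the command line asks to resume.
cli = Config(TrainerConfig(load_dir=Path("ckpts")), ViewerConfig(start_train=False))
merged = merge_cli_overrides(cli, Config())
```

Keeping the merge in one small function makes it easy to add further command-line fields that should survive loading a saved config.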

liu115 avatar Nov 01 '22 10:11 liu115