nerfstudio icon indicating copy to clipboard operation
nerfstudio copied to clipboard

Load config from yaml file and allow to overwrite arguments in the train.py script

Open pschroeppel opened this issue 2 years ago • 4 comments

Hi, with the scripts/train.py script, it is currently possible to load a config from a yaml file via the --load-config path/to/config.yaml argument. However, when this is done, all other command line arguments are ignored.

For example calling:

python scripts/train.py vanilla-nerf --load-config=/path/to/config.yaml --vis=viewer blender-data

would use a tensorboard visualization, if this is what was indicated in the config.yaml. Instead of this, in my opinion it would be nice to use the config.yaml as defaults, but overwrite specific values based on given command line arguments.

I'm not experienced with tyro, so I find it difficult to implement this. Is there an easy way to achieve what I want? best, Philipp

pschroeppel avatar Jan 24 '23 16:01 pschroeppel

Hi!

This is a good idea and sounds doable. One option is to:

  1. Load the contents of the YAML into a config object before any of the training script'styro logic.
  2. Pass a default=the_loaded_config_object argument into the tyro.cli() call here: https://github.com/nerfstudio-project/nerfstudio/blob/828d6bc893683042b3763247b65b3adeb5d5ea7e/scripts/train.py#L248-L251 This will tell tyro to read default values from the loaded object.

The question then becomes: how do we get the path to the YAML config before tyro.cli() is called? Two options I'm thinking are:

  1. We could remove load_config from the TrainerConfig object entirely, and instead use an environment variable. This would avoid the chicken/egg problem where load_config is in the config, but the config should be instantiated based on the contents of load_config.

    Usage would then look something like:

    NS_LOAD_CONFIG=../some/path/to/config.yml python scripts/train.py vanilla-nerf --vis=viewer blender-data
    
  2. A little more user-friendly: we could preemptively parse --load-config from sys.argv before any of the other fields. argparse + parse_known_args() might be useful for this.

Does this make sense?

brentyi avatar Jan 25 '23 05:01 brentyi

Hi, I tried now to implement option 2 (preemptively parse --load-config from sys.argv). For this I changed the entrypoint function in scripts/train.py as follows:

def entrypoint():
    """Entrypoint for use with pyproject scripts."""
    # Choose a base configuration and override values.
    config_file_parser = argparse.ArgumentParser("Parser for loading config from file")
    config_file_parser.add_argument('--load-config', type=Path, dest='config_path')
    args, _ = config_file_parser.parse_known_args()
    
    if args.config_path:
        CONSOLE.log(f"Loading pre-set config from: {args.config_path}")
        loaded_config = yaml.load(args.config_path.read_text(), Loader=yaml.Loader)
    else:
        loaded_config = None
    
    tyro.extras.set_accent_color("bright_yellow")
    main(
        tyro.cli(
            AnnotatedBaseConfigUnion,
            description=convert_markup_to_ansi(__doc__),
            default=loaded_config,
        )
    )

However, tyro doesn't accept the loaded_config and fails. It first gives warnings:

/misc/lmbraid18/schroepp/virtualenvs/miniconda3/envs/nerfstudio-rtx/lib/python3.8/site-packages/tyro/_fields.py:385: UserWarning: Could not find field __tyro_dummy_field__ in default instance TrainerConfig

/misc/lmbraid18/schroepp/virtualenvs/miniconda3/envs/nerfstudio-rtx/lib/python3.8/site-packages/tyro/_fields.py:677: UserWarning: Mutable type <class 'nerfstudio.engine.trainer.TrainerConfig'> is used as a default value for `__tyro_dummy_field__`. This is dangerous! Consider using `dataclasses.field(default_factory=...)` or marking <class 'nerfstudio.engine.trainer.TrainerConfig'> as frozen.

and then fails with an error message starting with:

Exception has occurred: AssertionError
`` was provided a default value of type <class 'nerfstudio.engine.trainer.TrainerConfig'> but no matching subcommand was found. A type may be missing in the Union type declaration for ``, which is currently set to typing.Union[typing_extensions.Annotated[nerfstudio.engine.trainer.TrainerConfig, _SubcommandConfiguration(
...

I'm not experienced with tyro and have difficulties understanding the problem. Do you know what is going on? best, Philipp

pschroeppel avatar Jan 27 '23 11:01 pschroeppel

Hi Philipp — thanks for taking the time to try this.

I experimented a bit with your implementation, and it seems like the sophisticated subcommands in nerfstudio make this problem harder than I initially imagined[^1]. This should be solvable, but to get something we'd be happy with I think I'll need to sit down with it for at least a few hours. A different solution or some upstream updates to tyro may be needed.

Added to my to-do list and I'll try to get to this this coming week, in the meantime if overriding a YAML is urgent it seems like a user can just copy/edit the YAML?

[^1]: To elaborate: when we pass in a default for a type with subcommands, tyro needs to figure out which subcommand the default corresponds to (is this config yaml from ns-train nerfacto? ns-train depth-nerfacto? ns-train instant-ngp?). For the vast majority of cases this matching is straightforward, but nerfstudio's subcommands happen to not fall into this category.

brentyi avatar Jan 28 '23 09:01 brentyi

Hi, alright, thanks a lot for looking into this! And it's not urgent at all, as it's easy enough to circumvent the problem. It's more a matter of convenience :) best, Philipp

pschroeppel avatar Jan 28 '23 13:01 pschroeppel

Brief update: with https://github.com/brentyi/tyro/pull/33 merged and bundled in the latest release of tyro we are on track for getting this issue resolved. Wrapping up is on my to-do list but not sure when I'll get to it. 🥲

brentyi avatar Feb 22 '23 03:02 brentyi