nerfstudio
nerfstudio copied to clipboard
Breakup base.py config file
The base.py
config file has become to goto place to dump all config files. This should be refactored in a way to make it easier to work with.
Two options:
- Place the configs in the same files as the classes they instantiate.
- Create a set config files for each type class, ie
dataset_configs.py
,model_configs.py
, ect.
I'm a fan of the first option; to bring up an old issue though, one annoying thing is that the way _target
is set up right now means putting configs in the same files as the classes they instantiate would require either putting the config after the class or using default_factory=
:
@dataclass
class BlenderDatasetConfig(DatasetConfig):
_target: Type = field(default_factory=lambda: Blender) # We can't write `Blender` directly, because `Blender` doesn't exist yet
data_directory: Path = Path("data/blender/lego")
...
class Blender:
...
Some potential fixes/alternatives, partially pasted from old conversations with Evonne+Ethan:
(1) @Config.register_target
decorator
T = TypeVar("T", bound=Type)
class InstantiateConfig:
def setup(...) -> ...: ...
@classmethod
def register_target(cls, model_cls: T) -> T:
cls._target = model_cls
return model_cls
@dataclasses.dataclass
class PipelineConfig(InstantiateConfig):
...
@PipelineConfig.register_target
class Pipeline:
...
(2) A typed alternative with typing.Generic
config.setup()
currently always statically evaluates to Any
.
For typing config.setup()
more strongly, we also have a mockup of an API that looks like:
@dataclasses.dataclass
class ModelConfig(
InstantiateConfig["Model"] # Quotes here indicate a forward reference.
):
layers: int
class Model:
...
Using this we can get the .setup()
method to statically evaluate to the expected type:
(3) Getting rid of config objects entirely?
Slightly more radical, but by pushing some typing.ParamSpec
objects around we could make an Args[SomeObject]
type that inherits fields from the (type-annotated) constructor of SomeObject
and type checks + autocompletes reasonably well.
A mockup:
All of these options would be easy enough for us to add. The primary question is what would be easiest to anyone using our repo for the first time...
(3) is nice since the params are defined at the location of the class (and is what people normally do...) Do you think it would be possible to get something like option 3, but where we have some function can take in the class and return a Config like the code in lines 7-14?
One thing with (3) that we'd lose if we got rid of the separate dataclasses is some inheritance succinctness: https://github.com/plenoptix/nerfactory/blob/6f3518d54f28d477e90c7d0b480891b675b97893/nerfactory/models/instant_ngp.py#L53-L58
Being able to just write self.config = config
(as opposed to repeating self.config_name = config_name
for every single configurable value) and pass the config directly to super().__init__()
without worrying about the contents as currently done is pretty nice.
Do you think it would be possible to get something like option 3, but where we have some function can take in the class and return a Config like the code in lines 7-14?
Do you have an example of what this would look like/how it would be used? Not totally following this.
Being able to just write self.config = config (as opposed to repeating self.config_name = config_name for every single configurable value) and pass the config directly to super().init() without worrying about the contents as currently done is pretty nice.
Hmm, yea that is a nice property. But aren't people pretty used to passing each configurable value? Also when it is just a config value being passed around, the docstrings and configurable values get dislocated from the class which actually uses them.
Do you have an example of what this would look like/how it would be used? Not totally following this.
I though more about this, it doesn't make sense, ignore ;)
For co-locating docstrings and configurable values with the class that uses them, a pattern like this seems OK:
# model.py
@dataclass
class ModelConfig:
num_layers: int
“””Some docstring”””
units: int
“””Some other docstring”””
class Model:
def __init__(self, config: ModelConfig):
# use config.num_layers, config.units here
The docstring experience also might be a better experience:
- When there are a lot of fields, dataclass documentation is less tedious than method documentation with the
Args:
thing where you have to re-list out the whole argument list and keep it synchronized. - If
ModelConfig
is a subclass, it’ll inherit docstrings for fields in the base class without extra effort (eg in IDE+autocomplete popups). If we get rid of the dataclass, documentation for arguments in superclasses, which is still relevant for subclasses, needs to be either omitted or repeated.
On Python inheritance in general, one thing I’m wondering about subclass constructors without dataclasses is if we’re forced to choose between:
- Listing out every single argument used by the base class constructor in the subclass constructor, then passing them in one-by-one via
super().__init__(arg1=arg1, arg2=arg2, arg3=arg3, …)
. - Or, alternatively, attaching a trailing
**kwargs
field to the subclass constructor, and callingsuper().__init__(**kwargs)
.’
Both seem difficult to scale; the first is terribly repetitive and the second can’t be type checked. Is there a better way to do this?
I generally like the (3) Args[Model]
approach, just wouldn’t want to lose the succinctness + type friendly benefits of the current pattern.
(4) We could also emulate Implicitron’s config pattern
They make every module a dataclass where the dataclass fields are the configurable values and the usual __init__
logic is moved into __post_init__
; combined with Args[]
, it seems like this could get us type checking + autocomplete and nice inheritance syntax without auxiliary config classes.
Downsides would be:
- Slightly unconventional.
-
Args[]
(which we could also callLazy[]
? it’s just an object that’s defined but not actually instantiated) is much more magical + potentially difficult to understand than an explicit config dataclass. - Maybe there are situations where having a config dataclass is actually really nice?
Ok, so maybe for now we should just keep what we have but colocate the configs with the classes.