donfig

Feature request: mark certain config keys as required

Open jhamman opened this issue 3 years ago • 6 comments

Is it in scope for Donfig to mark certain config keys as required? My use case is one where I'd like to require a user to specify some keys as part of their local config (like how Git requires you to configure your user name and email).

Two potential APIs that would work for my use case:

  1. Insert a sentinel value into defaults
from donfig import Config, required  # sentinel value

config = Config(
    'my_lib',
    defaults = [{
        'user': {'name': required, 'email': required},
        'foo': {'bar': None},
    }]
)
  2. A separate constructor arg for required keys:
config = Config(
    'my_lib',
    defaults = [{
        'user': {},
        'foo': {'bar': None},
    }],
    required=['user.name', 'user.email']
)

I'd like to raise an error if a required value is not found in any of the user configs (or environment variables) during setup of the Config class.
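
The check described above could look roughly like the following. This is only a sketch that works on a plain nested dict rather than on Donfig itself: `lookup`, `find_missing_keys`, `check_required`, and `MissingConfigError` are hypothetical names, not part of Donfig's API, and the dict passed in stands in for whatever Donfig assembles from defaults, user config files, and environment variables.

```python
from typing import Any, Iterable

_MISSING = object()  # marks "key not present", distinct from an explicit None


class MissingConfigError(KeyError):
    """Hypothetical error raised when required keys are absent."""


def lookup(config: dict, dotted_key: str) -> Any:
    """Walk a nested dict using a dotted key such as 'user.name'."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def find_missing_keys(config: dict, required: Iterable[str]) -> list:
    """Return the required dotted keys that have no entry in config."""
    return [key for key in required if lookup(config, key) is _MISSING]


def check_required(config: dict, required: Iterable[str]) -> None:
    """Raise MissingConfigError naming every required key that is absent."""
    missing = find_missing_keys(config, required)
    if missing:
        raise MissingConfigError(
            "missing required config keys: " + ", ".join(missing)
        )
```

In this sketch a key that is present but set to None counts as provided; a real implementation might prefer to treat None (or a `required` sentinel left over from the defaults) as missing too.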

jhamman, Oct 13 '22 22:10

I think so. I know it has long been discussed in dask to add a schema argument or YAML that gets parsed (and cached, I think) where you could define required keys and even enforce data types. Anything added to donfig should be influenced by, if not completely in line with, what dask has planned. We could even be their guinea pigs for trying it out if they want.

I'm on mobile so can't easily link the related dask and distributed issues.

djhoese, Oct 13 '22 23:10

I think https://github.com/dask/dask/issues/5695 and https://github.com/dask/dask/pull/6456 are both relevant here.

jhamman, Oct 14 '22 14:10

@jsignell What are your feelings on schema/validation stuff with dask config? Was getting a fast import time so difficult that it didn't seem worth it for what dask needed? Were there things you wanted to try but didn't have time for? Were there so many edge cases that it would have taken more time than you wanted to spend on it?

djhoese, Oct 14 '22 14:10

Yeah, those efforts kind of fell by the wayside in dask. There just wasn't much enthusiasm in the first place, so basically any impact on import time seemed like too much. I still think it would be nice to have, but it's not anyone's priority.

jsignell, Oct 20 '22 13:10

I've been continuing to think about this issue and I'm starting to think using a third-party validation framework like Pydantic may be a nice way to go. If Donfig supported plugging in different underlying containers for the config settings, we could probably sub in a Pydantic model in place of the dict here:

https://github.com/pytroll/donfig/blob/8169f5c1f7d7bb6ea058aca0ea9bb1e182e1c0b2/donfig/config_obj.py#L367

What do folks think about this:

from pydantic import BaseModel
from donfig import Config

class SettingsModel(BaseModel):
    log_level: int = 10
    option: str = 'foo'

config = Config('my_project', config_factory=SettingsModel)

config.set({"log_level": "debug"})  # --> raises an error (not castable to an int)

In this way, Pydantic could be used to:

  • define the config schema (exportable to json via SettingsModel.schema())
  • validate settings at runtime
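
To make the validation idea concrete without requiring Pydantic, here is a standard-library stand-in. It is not Pydantic and does not reproduce its behavior in detail; it only shows a settings class that coerces values to their annotated types and fails when a value cannot be cast. The field names mirror the example above, and the `update` helper is a hypothetical name.

```python
from dataclasses import dataclass, fields, replace
from typing import get_type_hints


@dataclass
class SettingsModel:
    """Stand-in for a validating model (plain dataclass, not Pydantic)."""

    log_level: int = 10
    option: str = "foo"

    def __post_init__(self) -> None:
        # Coerce every field to its annotated type. A value that cannot be
        # cast, such as int("debug"), raises ValueError at this point.
        hints = get_type_hints(type(self))
        for f in fields(self):
            expected = hints[f.name]
            value = getattr(self, f.name)
            if not isinstance(value, expected):
                setattr(self, f.name, expected(value))


def update(settings: SettingsModel, changes: dict) -> SettingsModel:
    """Return a new, re-validated instance with the changes applied."""
    return replace(settings, **changes)
```

Because `replace` goes through `__init__` again, every update is validated the same way as the initial construction, and an unknown field name is rejected with a TypeError.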

jhamman, Apr 04 '23 15:04

I like the idea of depending on another library, especially one concerned about performance, to handle the schema and validation. An alternative to pydantic might be msgspec:

https://jcristharif.com/msgspec/

I have only a little experience with pydantic, but it might be nice. It'd be nice, the way you describe it, if we can "duck type" the config_factory with the default being a dict and then look for standard methods for certain functionality if needed (validation, etc).
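
One way the duck typing could work is sketched below. It assumes a `config_factory` argument as proposed earlier in the thread; `ConfigHolder`, `CheckedDict`, and the `validate(key, value)` method are hypothetical names chosen for illustration, not Donfig's current API.

```python
from typing import Any, Callable


class ConfigHolder:
    """Hypothetical wrapper; not Donfig's actual Config class."""

    def __init__(self, config_factory: Callable[[], Any] = dict) -> None:
        # Any zero-argument callable returning a mapping-like object works.
        self.config = config_factory()

    def set(self, key: str, value: Any) -> None:
        # Containers that expose a `validate` method get a chance to reject
        # the value; a plain dict has no such method and accepts anything.
        validate = getattr(self.config, "validate", None)
        if callable(validate):
            validate(key, value)
        self.config[key] = value


class CheckedDict(dict):
    """Example container: a dict that checks value types against a schema."""

    schema = {"log_level": int, "option": str}

    def validate(self, key: str, value: Any) -> None:
        expected = self.schema.get(key)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(
                f"{key!r} expects {expected.__name__}, got {type(value).__name__}"
            )
```

With the default factory nothing changes for existing users, since a dict is never asked to validate. Passing `config_factory=CheckedDict` opts in to validation, and a rejected value is never stored because the check runs before the assignment.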

djhoese, Apr 04 '23 19:04