mlops-python-package

pydantic-settings as an alternative

Open martinkozle opened this issue 1 year ago • 5 comments

Hello, I recommend checking out pydantic-settings as an alternative to omegaconf + pydantic.

martinkozle avatar Jul 12 '24 21:07 martinkozle

I find this module a bit too magical, and I'm not sure it supports deep merging like OmegaConf does: https://omegaconf.readthedocs.io/en/2.3_branch/usage.html#omegaconf-merge. Would you have an example with pydantic-settings to share?
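
For readers unfamiliar with the term, "deep merging" here means that nested mappings are combined key by key instead of being replaced as a whole. The following is a minimal plain-Python sketch of the idea, not OmegaConf's implementation; the function name deep_merge and the sample values (including the host key) are hypothetical.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with `override` merged into `base`, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale in this sketch.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical sample configs.
base = {"server": {"port": 80, "host": "localhost"}, "users": ["user1", "user2"]}
override = {"server": {"port": 82}, "log": {"file": "log.txt"}}

merged = deep_merge(base, override)  # keeps server.host, updates server.port
shallow = {**base, **override}       # replaces the whole "server" mapping

print(merged["server"])
print(shallow["server"])
```

The shallow merge loses the host key because the whole server mapping is swapped out, which is the difference the question is about.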

fmind avatar Jul 26 '24 20:07 fmind

I recreated the OmegaConf merge example using pydantic-settings with some extra examples:

import os
import sys

import yaml
from pydantic import BaseModel
from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    SettingsConfigDict,
    YamlConfigSettingsSource,
)


class Server(BaseModel):
    port: int


class Log(BaseModel):
    file: str


class Settings(BaseSettings, cli_parse_args=True):
    model_config = SettingsConfigDict(
        yaml_file=["./example2.yaml", "./example3.yaml"], env_nested_delimiter="__"
    )

    server: Server
    users: list[str]
    log: Log

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        return (init_settings, env_settings, YamlConfigSettingsSource(settings_cls))


settings = Settings()
print(settings)
# server=Server(port=80) users=['user1', 'user2'] log=Log(file='log.txt')

print(settings.model_dump())
# {'server': {'port': 80}, 'users': ['user1', 'user2'], 'log': {'file': 'log.txt'}}


sys.argv = ["merge_example.py", "--server.port", "82"]
settings_with_cli = Settings()
print(settings_with_cli)
# server=Server(port=82) users=['user1', 'user2'] log=Log(file='log.txt')

print(yaml.dump(settings_with_cli.model_dump()))
"""
log:
  file: log.txt
server:
  port: 82
users:
- user1
- user2
"""


settings_with_init = Settings(users=["user3", "user4"])
print(settings_with_init)
# server=Server(port=82) users=['user3', 'user4'] log=Log(file='log.txt')


os.environ["LOG__FILE"] = "log2.txt"
settings_with_env = Settings()
print(settings_with_env)
# server=Server(port=82) users=['user1', 'user2'] log=Log(file='log2.txt')

The comments represent the standard output for each print.

The priority is defined by the order of the tuple elements returned by settings_customise_sources: sources listed earlier take precedence over those listed later.
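
As a rough analogy for this lookup order (an illustration only, not how pydantic-settings is implemented), the standard library's ChainMap resolves each key from the first mapping that contains it. The three dictionaries below are hypothetical stand-ins for the init, environment, and YAML sources in the example; note that ChainMap only looks at top-level keys.

```python
from collections import ChainMap

# Hypothetical values standing in for each source, highest priority first.
init_values = {"users": ["user3", "user4"]}
env_values = {"log": {"file": "log2.txt"}}
yaml_values = {
    "server": {"port": 80},
    "users": ["user1", "user2"],
    "log": {"file": "log.txt"},
}

# The first mapping that defines a key wins, mirroring the tuple order.
resolved = ChainMap(init_values, env_values, yaml_values)

print(resolved["users"])   # taken from init_values
print(resolved["log"])     # taken from env_values
print(resolved["server"])  # only defined in yaml_values
```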

If you need quick, non-validated loading of configs, then OmegaConf seems easier. But if you are going to define Pydantic models anyway to validate the loaded OmegaConf config, then you could do both the loading and the validation in a declarative way using pydantic-settings.

Which one is easier to read and understand is subjective, I would say, and depends on what you are used to. But functionality-wise, I would definitely say it is an alternative, as it supports a variety of formats as well.

martinkozle avatar Jul 27 '24 09:07 martinkozle

Thanks for the complete example @martinkozle. I had this discussion with my colleagues, and we found two ways of integrating external settings:

  1. Load settings "statically" from config files (+ environment): this is best suited for applications (e.g., web apps with well-defined environments) where most settings are known in advance. This seems to be the best setup with pydantic-settings.
  2. Load settings "dynamically" from config files: this is best suited when configs are not known in advance (i.e., config files can be changed up to startup). I'm not sure pydantic-settings can handle this setup well.

I looked at the docs, https://docs.pydantic.dev/latest/concepts/pydantic_settings, and it seems the list of files is static (i.e., you cannot change them from the command line). For me this is an issue, as in most of my past experience we generated tons of config files and we would adjust the run using them. Thus, I think OmegaConf enables more flexibility, even if settings are not validated upfront.

I would propose to integrate your snippet in a Gist and reference it in the package. If Pydantic provides dynamic options from the command line, I'll be happy to integrate them. What do you think?

fmind avatar Jul 28 '24 18:07 fmind

Aha, I see. I haven't needed that feature so far, so I haven't thought about it.

One workaround, which I don't quite like because it involves mutating the class-level config, would be:

Settings.model_config["yaml_file"] = [
    "./example4.yaml",
    "./example5.yaml",
    "./example2.yaml",
]
settings = Settings()

Or subclassing Settings and overriding the model_config.

If only there were a way to override yaml_file when instantiating the object, similar to how the env file can currently be overridden.

martinkozle avatar Jul 29 '24 09:07 martinkozle

Interesting snippet, I hadn't considered this option.

I'm going to use Pydantic on a new project, and I'll test pydantic-settings. Let's see how it goes.

fmind avatar Jul 31 '24 17:07 fmind

I had the opportunity to test Pydantic Settings in another project. It's really cool, thanks for the highlight! However, I think it would require too much effort to incorporate it in this repository, as I would need to revamp the whole config system. Moreover, I'm not sure it can merge YAML files as currently proposed. Thus, I'm closing the issue.

https://github.com/fmind/bromate

fmind avatar Sep 13 '24 19:09 fmind