pydantic-settings icon indicating copy to clipboard operation
pydantic-settings copied to clipboard

Improved handling of setting sources

Open Geosynopsis opened this issue 2 years ago • 2 comments

Hey, In this issue, I'd like to propose a general improvement to the pydantic-settings, especially the handling of the sources. If it's agreeable, will create a PR. The current implementation of the source settings consists of the following three sources:

  • InitSettingSource
  • EnvSettingSource
  • SecretSettingSource

Adding more sources and changing the order of sources requires patching the customise_sources method of Config class. Additionally, most of the logic in the EnvSettingSource and SecretSettingSource is transferrable to any kind of KeyValue source and Directory&File based source respectively.

Therefore, I'd propose that we:

  • Separate settings source access pattern and parsing:
    • Setting source access pattern could be made uniform by accepting any object that has an Immutable Mapping interface. os.environ exposes this already, for the local secret directory we would implement one. For other exotic sources like remote password managers, the user can implement it for themselves.
    • The mapping logic would be the same as EnvironmentSettings. However, some attributes need renaming to reflect the changes. Eg. env_nested_delimiter -> nested_delimiter, env_prefix -> prefix etc.
  • Declare a public interface to register the sources (Could be a sequence of source accessors in the Config class itself)

An example of how it could look:

import inspect
import os
from typing import ClassVar, Dict, List, Mapping, Optional, Type, Union

from pydantic import BaseModel


def parse_sources(sources: List[Mapping], setting: 'BaseSettings'):
    # Simplistic view on how parsing could look like.
    result = {}
    for field in setting.__fields__.values():
        for source in sources:
            value = source.get(field.name)
            if value:
                result[field.alias] = value
                break
    return result

get_env_source = lambda: os.environ

def get_env_file_source(file_path: Union[str, os.PathLike], encoding:str, case_sensitive:bool = False):
    try:
        from dotenv import dotenv_values
    except ImportError as e:
        raise ImportError('python-dotenv is not installed, run `pip install pydantic[dotenv]`') from e

    file_vars: Dict[str, Optional[str]] = dotenv_values(file_path, encoding=encoding or 'utf8')
    if not case_sensitive:
        return {k.lower(): v for k, v in file_vars.items()}
    else:
        return file_vars


class BaseSettings(BaseModel):
    def _build_values(self, init_kwargs: Dict[str, Any], source_kwargs: Dict[str, Any]):
        sources = [init_kwargs]
        for source_provier in self.__config__.source_providers:
            signature = inspect.signature(source_provier)
            kwargs = {}
            for parameter in signature.parameters.values():
                # Checks if the required parameter is provided to either of
                # build_values or Config class. In config class attribute name
                # is same as the parameter name while in the build_values
                # function it expects to be prefixed with _
                value = source_kwargs.get(f"_{parameter.name}", getattr(self.__config__, parameter.name))
                if not value and isinstance(parameter.default, inspect._empty):
                    raise ValueError("Function `{source_provider}`: Missing value for required parameter `{parameter.name}`")
                kwargs[parameter.name] = value
            sources.append(source_provier(**kwargs))
        return parse_sources(sources, self)

    class Config:
        prefix: str = ''
        env_file: Optional[str]
        env_file_encoding: Optional[str]
        nested_delimiter: Optional[str]
        source_providers = [
            get_env_source,
            get_env_file_source,
            ...
        ]

    __config__: ClassVar[Type[Config]]

Geosynopsis avatar Oct 09 '22 14:10 Geosynopsis

I agree with the basic idea here, happy to review a PR.

A few things to note here:

  • os.environ as a Mapping interface isn't quite enough in itself since we need to deal with the (default) case-insensitive case
  • also with delimited env vars e.g. FOO__BAR__BAZ
  • also with JSON encoded env. vars. and with an alternative to JSON encoding, like comma separated lists - basically how we parse env vars for complex fields needs to be configurable
  • also need to think about which functionality from the other pydantic-settings package we need to support - I've agreed that we'll take on the useful bits of that package

In short, I'm happy to review a PR, but I think we need to do more thing about what we want the interface and capabilities to be before we start writing library code.

samuelcolvin avatar Oct 12 '22 09:10 samuelcolvin

Yes, the issues, you have raised, need to be dealt with. The example only shows high-level architectural components and is intentionally simplified.

However, to make the behaviour the same across the different containers (mapping interfaces), I would propose dealing with the parsing logic (including the case-sensitivity, nesting, complex fields etc.) on mapping logic. It opens up a way for the users to maintain consistency between the different setting backends. Eg. my envvars could be an exact copy of the key-values in Kubernetes secret. I'm not sure as of now, how it would impact the performance.

Encoding has to be dealt with on the data container side as you've pointed out however that won't block us from exposing all containers with the Mapping interface.

Regarding migrating some of the old pydantic-settings, what do you have in mind? Do you already have a few functionalities in mind you would like to take ahead? How critical is it to be backwards-compatible for old pydantic-settings?

Geosynopsis avatar Oct 15 '22 09:10 Geosynopsis

Fixed in https://github.com/pydantic/pydantic-settings/commit/0ecac227b2814983f426bd1d8b4b3d60dbffac8b

hramezani avatar Apr 26 '23 12:04 hramezani