pydantic-settings Add CLI Settings Source

Resolves #209.

This PR implements a CliSettingsSource with the following main features outlined below. Refer to docs/index.md for more details.

Current Examples

Fields

Same expectations as setting environment variables
Proposed _cli_parse_args can accept arg array or bool set to True for sys.argv

from pydantic import BaseModel

from pydantic_settings import BaseSettings


class DeepSubModel(BaseModel):
    v4: str


class SubModel(BaseModel):
    v1: str
    v2: bytes
    v3: int
    deep: DeepSubModel


class Settings(BaseSettings):
    v0: str
    sub_model: SubModel


print(Settings(_cli_parse_args=[
    '--v0=0',
    '--sub_model={"v1": "json-1", "v2": "json-2"}',
    '--sub_model.v2', 'nested-2',
    '--sub_model.v3', '3',
    '--sub_model.deep.v4', 'v4'
]).model_dump())
'''
{'v0': '0', 'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}}}
'''

Lists

Proposed intermixing of three list styles:
- JSON style --field='[1,2,3]'
- Argparse style --field 1 --field 2 --field 3
- Lazy style --field=1,2,3

from pydantic import BaseModel
from typing import List
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    my_list: List[int]

print(Settings(_cli_parse_args=[
    '--my_list', '[1,2]',
    '--my_list', '3,4',
    '--my_list', '5', '--my_list', '6'
]).model_dump())
'''
{'my_list': [1, 2, 3, 4, 5, 6]}
'''

Dictionaries

Proposed intermixing of two dictionary styles:
- JSON style --field='{"k1": 1, "k2": 2}'
- Env Var style --field k1=1 --field k2=2
These styles can be used in conjunction with list forms as well:
- --field k1=1,k2=2 --field k3=3 --field '{"k4: 4}' etc.

from pydantic import BaseModel
from typing import Dict
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    my_dict: Dict[str, str]


print(Settings(_cli_parse_args=[
    '--my_dict', '{"k1": "a", "k2": "b"}',
    '--my_dict', '{"k3": "c"}, {"k4": "d"}',
    '--my_dict', '{"k5": "e"}', '--my_dict', '{"k6": "f"}',
    '--my_dict', '[k7=g,k8=h]',
    '--my_dict', 'k9=i,k10=j',
    '--my_dict', 'k11=k', '--my_dict', 'k12=l',
]).model_dump())
'''
{'my_dict': {
    'k1': 'a',
    'k2': 'b',
    'k3': 'c',
    'k4': 'd',
    'k5': 'e',
    'k6': 'f',
    'k7': 'g',
    'k8': 'h',
    'k9': 'i',
    'k10': 'j',
    'k11': 'k',
    'k12': 'l',
}}
'''

Subcommands and Positionals

Proposed help text generation based off of field descriptions and class documentation
Proposed _cli_prog_name has same meaning as in argparse
Proposed CliSubCommand annotation for marking subcommands
Proposed CliPositionalArg annotation for marking positional arguments

from pydantic import BaseModel
from typing import Tuple, Type
from pydantic_settings import BaseSettings, CliSubCommand, CliPositionalArg, PydanticBaseSettingsSource


class FooPlugin(BaseModel, use_attribute_docstrings=True):
    '''git-plugins-foo - Extra deep foo plugin command'''

    my_feature: bool = False
    '''Enable my feature on foo plugin'''


class BarPlugin(BaseModel, use_attribute_docstrings=True):
    '''git-plugins-bar - Extra deep bar plugin command'''

    my_feature: bool = False
    '''Enable my feature on bar plugin'''


class Plugins(BaseModel, use_attribute_docstrings=True):
    '''git-plugins - Fake plugins for GIT'''

    foo: CliSubCommand[FooPlugin]
    '''Foo is fake plugin'''

    bar: CliSubCommand[BarPlugin]
    '''Bar is also a fake plugin'''


class Clone(BaseModel, use_attribute_docstrings=True):
    '''git-clone - Clone a repository into a new directory'''

    repository: CliPositionalArg[str]
    '''The repository to clone'''

    directory: CliPositionalArg[str]
    '''The directory to clone into'''

    local: bool = False
    '''When the resposity to clone from is on a local machine, bypass ...'''

    shared: bool = False
    '''Force the clone process from a reposity on a local filesystem ...'''


class Init(BaseModel, use_attribute_docstrings=True):
    '''git-init - Create an empty Git repository or reinitialize an existing one'''

    directory: CliPositionalArg[str]
    '''The directory to clone into'''

    quiet: bool = False
    '''Only print error and warning messages; all other input will be suppressed'''

    bare: bool = False
    '''Create a bare repository. If GIT_DIR environment is not set, it is set ...'''


class Git(BaseSettings, use_attribute_docstrings=True):
    '''git - The stupid content tracker'''

    clone: CliSubCommand[Clone]
    '''Clone a repository into a new directory'''

    init: CliSubCommand[Init]
    '''Create an empty Git repository or reinitialize an existing one'''

    plugins: CliSubCommand[Plugins]
    '''Fake GIT plugin commands'''


print(Git(_cli_prog_name='git', _cli_parse_args=['--help']))
'''
usage: git [-h] {clone,init,plugins} ...

git - The stupid content tracker

options:
  -h, --help            show this help message and exit

subcommands:
  {clone,init,plugins}
    clone               Clone a repository into a new directory
    init                Create an empty Git repository or reinitialize an existing one
    plugins             Fake GIT plugin commands
'''


print(Git(_cli_prog_name='git', _cli_parse_args=['clone', '--help']))
'''
usage: git clone [-h] [--local bool] [--shared bool] REPOSITORY DIRECTORY

git-clone - Clone a repository into a new directory

positional arguments:
  REPOSITORY     The repository to clone
  DIRECTORY      The directory to clone into

options:
  -h, --help     show this help message and exit
  --local bool   When the resposity to clone from is on a local machine, bypass ...
  --shared bool  Force the clone process from a reposity on a local filesystem ...
'''


print(Git(_cli_prog_name='git', _cli_parse_args=['plugins', 'bar', '--help']))
'''
usage: git plugins bar [-h] [--my_feature bool]

git-plugins-bar - Extra deep bar plugin command

options:
  -h, --help         show this help message and exit
  --my_feature bool  Enable my feature on bar plugin
'''

Jan 22 '24 05:01 kschwab

@hramezani and @samuelcolvin

I am most of the way through but hit this limitation in the underlying environment variable parsing:

JSON is only parsed in top-level fields, if you need to parse JSON in sub-models, you will need to implement validators on those models.

Obviously, this would be very nice to have from a CLI perspective (e.g., nested lists etc.). I didn't know this was the case until I started writing nested tests. Do you have suggestions on what the best course of action is here? Ideally, I would love for it to be handled in environment parsing. If not, from CLI perspective the next easiest path would be nested dictionaries of strings. Is that something I can throw to Pydantic and have a reasonable expectation it will sort it out?

Jan 22 '24 06:01 kschwab

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (40230ab) 97.69% compared to head (cb9c1c3) 98.65%. Report is 5 commits behind head on main.

:exclamation: Current head cb9c1c3 differs from pull request most recent head a984f32. Consider uploading reports for the commit a984f32 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #214      +/-   ##
==========================================
+ Coverage   97.69%   98.65%   +0.96%     
==========================================
  Files           5        5              
  Lines         347      597     +250     
  Branches       76      152      +76     
==========================================
+ Hits          339      589     +250     
  Misses          6        6              
  Partials        2        2

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Jan 22 '24 06:01 codecov[bot]

Thanks @kschwab for this patch 🙏

Regarding JSON parsing issue please take the easiest path.

Also, you need to add documentation for this and show how to use this new feature. I can see you've added different configs, please add some examples for them in the doc to show what is the use case of them.

Jan 22 '24 11:01 hramezani

Sorry @hramezani, I should have been more specific with my question. I am at a fork and am looking for guidance on which direction to take.

Is there any background on why nested JSON isn't supported in EnvSettingsSource? Was there something that scared us away from adding it? Is it a massive pile of work? etc.

My preference would be to solve it there so that the CLI and ENV are consistent. Just trying to ascertain if that is a reasonable direction or not.

Jan 22 '24 17:01 kschwab

I think the comment that you brought from the doc is an old one and related to the first time that nested env support added to Pydantic V1.

Here is my understanding from the comment: if you have a settings class like

class SubValue(BaseModel):
    v4: str
    v5: int

class TopValue(BaseModel):
    v1: str
    v2: str
    v3: str
    sub: SubValue

class Cfg(BaseSettings):
    v0: str
    top: TopValue
    model_config = SettingsConfigDict(env_nested_delimiter='__')

you only can provide JSON values for Cfg.top. e.g.

V0=test TOP='{"v1": "json-1", "v2": "nested-2", "v3": "3", "sub": {"v4": "v4", "v5": 5}}'

and it is not possible to provide JSON values for sub fields like:

V0=test TOP='{"v1": "json-1", "v2": "nested-2", "v3": "3"}' TOP__SUB='{"v4": "v4", "v5": 5}'

But, you can see that passing JSON values for sub fields also works. So, Probably we can remove it.

Jan 22 '24 17:01 hramezani

Got it. So it's either a bug or something invalid in the test case. Below is an example of the issue I am seeing:

import os
from pydantic import BaseModel
from typing import Optional, List
from pydantic_settings import BaseSettings


class Child(BaseModel):
    num_list: Optional[List[int]] = None


class Cfg(BaseSettings, env_nested_delimiter='__'):
    child: Optional[Child] = None


os.environ['CHILD__NUM_LIST'] = '[1,2,3]'
print(Cfg().model_dump())
'''
Traceback (most recent call last):
  File "/home/kschwab/pydantic-settings/example.py", line 23, in <module>
    print(Cfg().model_dump())
  File "/home/kschwab/pydantic-settings/pydantic_settings/main.py", line 93, in __init__
    super().__init__(
  File "/home/kschwab/.local/lib/python3.10/site-packages/pydantic/main.py", line 171, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Cfg
child.num_list
  Input should be a valid list [type=list_type, input_value='[1,2,3]', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/list_type
'''

If you note the input_value='[1,2,3]' in the validation error it got the correct input, it just did not convert it into a list. This is what led me to believe that nested JSON must not be supported.

Jan 22 '24 19:01 kschwab

It's a bug. Here is the related issue

Jan 22 '24 20:01 hramezani

Well, not great that it's a bug, but good in the context of this PR since it allows us to keep using EnvSettingsSource as the backbone. We'll have to address that issue first before this PR goes in.

I'll continue on with rest of documentation, cleanup, etc. and loop back to it if it's still there.

Jan 22 '24 23:01 kschwab

The bulk of the functionality is now complete. Will continue to work on cleanup, comments, docs etc. over next few days. The only key remaining feedback items needed for any adjustments are:

Subcommand and Positional Argument Implementation (refer to GIT CLI example for reference)
- [ ] Does the CliSubCommand and CliPositionalArg annotations for denoting their respective meanings look OK?
  - There are a few ways to accomplish this. I took the stance of a more explicit annotation at the user level for denoting subcommands and positional arguments. Overall, it is quite clean and makes intent clear when reading the model. The alternative approach is implicit coercion, with union fields into subcommands and required fields into positionals. This takes flexibility away from user, who may want a similar experience to that as environment variable overrides. Therefore, the default experience is in line with how Pydantic is already structured, and deviation from the expectation is expressed through these annotations.
- [ ] Is single subparser per model OK?
  - I took an opinionated stance of only supporting a single subcommand group (single subparser) per model. If we want to support multiple subparsers, we would likely need to go back to an earlier version I had here. I changed from that direction however because:
    - Better control of generated help text. With a union implementation, the help text for the subcommand gets pulled from the class doc, which works so long as it is not long. However, in our application, we work with larger models that do have verbose class documentation so it can flood the help text. Simple workaround would be parsing the first sentence of the class doc etc. but not worth it IMO. With current implementation, the subcommand help text is the field doc string making it very straightforward. Then, the class doc itself is displayed when using help on the subcommand itself. e.g. git --help shows condensed help for the clone subcommand, whereas git clone --help shows the expanded class doc help text there.
    - Explicit naming of subcommands. In union implementation, the subcommand name must be coerced from the class name or some other mechanism. With current implementation, it is whatever the subcommand field name is.
    - Avoids the need for Discriminated Unions. Absolutely an option but requires more boilerplate, which takes away from the clean aesthetic. This was actually the forcing issue that caused me to change direction. The normal union selection was not strong enough so users would have needed to add discriminated unions for any subcommand.
- [ ] Positional arguments always placed before subcommands?
  - Another opinionated stance, although of smaller concern. As is today, there is no requirement with regards to field ordering within the definition of a model. However, the resultant CLI will reorder positional args before subcommands. This is mostly to prevent users from shooting themselves in the foot. Perhaps this is best to just change to a flag, i.e., turn the safety off?

Outside of the above, I think everything else is pretty straightforward and aligns with precedence already set by environment variables.

Jan 24 '24 02:01 kschwab

Hi @hramezani, need some help with resolving the below:

Many of the doc examples use CLI --help to view generated help text etc. However, this results in an exit, which pytest does not like. How to update pytest such that exit is allowed in these expected cases?
Many of the doc examples use the recently merged use_attribute_docstrings flag, which does not appear available in tests. How to update the tests such that the pydantic version includes this commit?
Is there a way to generate the docs locally for viewing?

Jan 28 '24 20:01 kschwab

Many of the doc examples use CLI --help to view generated help text etc. However, this results in an exit, which pytest does not like. How to update pytest such that exit is allowed in these expected cases?

You can use

```py test="skip"

to skip running on those blocks

Many of the doc examples use the https://github.com/pydantic/pydantic/pull/6563 use_attribute_docstrings flag, which does not appear available in tests. How to update the tests such that the pydantic version includes this commit?

This will be available on Pydantic 2.6. It is in beta and will be released soon. We are planing to release the next pydantic-settings after Pydantic 2.6. So, in the end, we can use use_attribute_docstrings flag.

Is there a way to generate the docs locally for viewing?

Not right now. pydantic-settings docs will be rendered under pydantic docs. But I will work on it to make it possible to render docs in pydantic-settings

Jan 29 '24 09:01 hramezani

Hi @hramezani, almost ready to move this into review. There are three final items:

I do not plan to add coverage for this line, unless there is a safe/easy way in pytest to modify sys.argv. Is this acceptable? If so, how to exclude?
I would like to exclude this function from coverage or leverage the original test case if you can help locate. I pulled this from pydantic._internal._repr.py. Please provide guidance on desired direction.
Also related to above number two, I am hitting a mypy linting issue related to typing_extensions.TypeAliasType here. This is directly copied from the _repr version, how would you like me to proceed?

Jan 30 '24 01:01 kschwab

I do not plan to add coverage for this line, unless there is a safe/easy way in pytest to modify sys.argv. Is this acceptable? If so, how to exclude?

Can't we use monkeypatch fixture like monkeypatch.setattr("sys.argv", ["test1", "-f", "test2"])?

I would like to exclude this function from coverage or leverage the original test case if you can help locate. I pulled this from pydantic._internal._repr.py. Please provide guidance on desired direction.

There are some tests in pydantic repo

https://github.com/pydantic/pydantic/blob/c1dff153c8ba2894e2b060cf6e53de6851ba4cd9/tests/test_utils.py#L83
https://github.com/pydantic/pydantic/blob/c1dff153c8ba2894e2b060cf6e53de6851ba4cd9/tests/test_utils.py#L106

Also related to above number two, I am hitting a mypy linting issue related to typing_extensions.TypeAliasType here. This is directly copied from the _repr version, how would you like me to proceed?

Please ignore it

Jan 30 '24 12:01 hramezani

Thanks @hramezani for the feedback. Ready for review 👍🏾

Jan 31 '24 03:01 kschwab

please update

Feb 16 '24 09:02 hramezani

@hramezani @samuelcolvin Thank you for the feedback, really appreciate it and will make updates accordingly. We found a couple of changes we'll also bring in after playing with it internally:

The enum support should be moved deeper into the environment variable parsing. Will open a separate issue and address it there.
We plan to integrate this with other existing parsers (one of which is pytest), so will be providing some support to "export" the internally generated parser and likely a pytest specific example. Still working on how to best accomplish this but will push once ready.

Feb 17 '24 19:02 kschwab

The enum support should be moved deeper into the environment variable parsing. Will open a separate issue and address it there.

Opened #252 to address this.

Mar 05 '24 05:03 kschwab

@hramezani @samuelcolvin @dmontagu updates are complete and ready for review.

Mar 24 '24 19:03 kschwab

@hramezani @samuelcolvin @dmontagu updates are complete and ready for review.

Thanks @kschwab for the update. We will review it later and probably this will be included in the next release

Mar 24 '24 21:03 hramezani

@kschwab Sorry for the late response. Finally, I found time to work on this and test it. The implementation is good and the doc is very well 👍

There are something that need to be fixed:

If you enable SettingsConfigDict(cli_parse_args=True), then the CLI source will be the first priority and there is no way to change the priority. This is not like other sources that we've added before like toml and yaml. you include them and change the priority as well. So, this needs to be changed and we should have the same behavior as other new sources.
It doesn't work with aliases. with the following model:

class Settings(BaseSettings):
    apple: str = Field(alias='alias')

    model_config = SettingsConfigDict(cli_parse_args=True)

we can't use --alias foo. As alias support is included in other sources, we have to have it in CLI as well.

It doesn't work with case_sensitive config and always expects the field name. With the following model:

class Settings(BaseSettings):
    foo: str

    model_config = SettingsConfigDict(cli_parse_args=True, case_sensitive=False)  # default is `False`

Then --FOO bar doesn't work. in env source, we convert all the env names to lowercase in when case_sensitive=False

It doesn't work well with dataclasses in sub model. for example with:

@pydantic_dataclasses.dataclass
class MyDataclass:
    foo: int

class Settings(BaseSettings):
    n: MyDataclass

You have to pass --n '{"foo": 123} but --n.foo 123 doesn't work. but it works for normal basemodel submodels

Thanks for your great work here ❤️

May 10 '24 09:05 hramezani

Thanks @hramezani for the review, we'll address the feedback items soon.

With respect to the CliSettingsSouce prioritization, originally, we had prioritization that could be changed but removed it after our discussion here. The conclusion was to use fixed prioritization so as to not break backwards compatibility, and then revisit dynamic prioritization once this issue is fixed in V3.

We can change it back to the original if you would prefer, let me know your thoughts.

May 14 '24 23:05 kschwab

Thanks @hramezani for the review, we'll address the feedback items soon.

With respect to the CliSettingsSouce prioritization, originally, we had prioritization that could be changed but removed it after our discussion here. The conclusion was to use fixed prioritization so as to not break backwards compatibility, and then revisit dynamic prioritization once this issue is fixed in V3.

We can change it back to the original if you would prefer, let me know your thoughts.

Yeah. that was my first thought. but after talking with the team we decided to have the CLI source like other sources. Can't we implement the new CLI source like other new sources(toml, yaml) and include it like them? it would be good to Sorry about that.

May 15 '24 09:05 hramezani

No worries!

Can't we implement the new CLI source like other new sources(toml, yaml) and include it like them?

I haven't looked at the latest toml, yaml source changes and how it was integrated, but will try and align with it if possible. I don't think there should be any issues. The only thing that comes to mind is the external parser support, but I think it should be fine. We'll make the updates and see where we land 🙂

May 15 '24 23:05 kschwab

@hramezani I've addressed the issues raised. For case-insensitive matching, this is not supported on external parsers, it is only supported on the internal parser.

May 24 '24 06:05 kschwab

@hramezani I've addressed the issues raised. For case-insensitive matching, this is not supported on external parsers, it is only supported on the internal parser.

Thanks @kschwab for addressing the issues. I will check it later.

May 24 '24 06:05 hramezani

Thanks @samuelcolvin for the review.

I added the differentiation with the last commit. Required fields are marked as (required) and optional fields are marked as (default: val). This also aligns with argparse ArgumentDefaultsHelpFormatter in how it delineates optional/default arguments.

usage: demo.py [-h] [--foo str] [--bar int]

Settings for the application.

options:
  -h, --help  show this help message and exit
  --foo str   Set the foo value. (required)
  --bar int   (default: 123)

Jun 02 '24 21:06 kschwab

@kschwab Thanks for your update.

I am going to merge the PR if you are done with your updates.

It will be released soon 🚀

Jun 03 '24 10:06 hramezani

Thanks @kschwab for your great effort here ❤️

Jun 03 '24 14:06 hramezani

Thanks @hramezani! Glad to see it go in 👍🏾

Jun 04 '24 07:06 kschwab

pydantic-settings pydantic-settings copied to clipboard

Add CLI Settings Source

Current Examples

Fields

Lists

Dictionaries

Subcommands and Positionals

Codecov Report

pydantic-settings
pydantic-settings copied to clipboard