pydantic-settings
pydantic-settings copied to clipboard
Add CLI Settings Source
Resolves #209.
This PR implements a CliSettingsSource
with the following main features outlined below. Refer to docs/index.md for more details.
Current Examples
Fields
- Same expectations as setting environment variables
- Proposed
_cli_parse_args
can accept arg array orbool
set toTrue
forsys.argv
from pydantic import BaseModel
from pydantic_settings import BaseSettings
class DeepSubModel(BaseModel):
v4: str
class SubModel(BaseModel):
v1: str
v2: bytes
v3: int
deep: DeepSubModel
class Settings(BaseSettings):
v0: str
sub_model: SubModel
print(Settings(_cli_parse_args=[
'--v0=0',
'--sub_model={"v1": "json-1", "v2": "json-2"}',
'--sub_model.v2', 'nested-2',
'--sub_model.v3', '3',
'--sub_model.deep.v4', 'v4'
]).model_dump())
'''
{'v0': '0', 'sub_model': {'v1': 'json-1', 'v2': b'nested-2', 'v3': 3, 'deep': {'v4': 'v4'}}}
'''
Lists
- Proposed intermixing of three list styles:
- JSON style
--field='[1,2,3]'
- Argparse style
--field 1 --field 2 --field 3
- Lazy style
--field=1,2,3
- JSON style
from pydantic import BaseModel
from typing import List
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
my_list: List[int]
print(Settings(_cli_parse_args=[
'--my_list', '[1,2]',
'--my_list', '3,4',
'--my_list', '5', '--my_list', '6'
]).model_dump())
'''
{'my_list': [1, 2, 3, 4, 5, 6]}
'''
Dictionaries
- Proposed intermixing of two dictionary styles:
- JSON style
--field='{"k1": 1, "k2": 2}'
- Env Var style
--field k1=1 --field k2=2
- JSON style
- These styles can be used in conjunction with list forms as well:
-
--field k1=1,k2=2 --field k3=3 --field '{"k4: 4}'
etc.
-
from pydantic import BaseModel
from typing import Dict
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
my_dict: Dict[str, str]
print(Settings(_cli_parse_args=[
'--my_dict', '{"k1": "a", "k2": "b"}',
'--my_dict', '{"k3": "c"}, {"k4": "d"}',
'--my_dict', '{"k5": "e"}', '--my_dict', '{"k6": "f"}',
'--my_dict', '[k7=g,k8=h]',
'--my_dict', 'k9=i,k10=j',
'--my_dict', 'k11=k', '--my_dict', 'k12=l',
]).model_dump())
'''
{'my_dict': {
'k1': 'a',
'k2': 'b',
'k3': 'c',
'k4': 'd',
'k5': 'e',
'k6': 'f',
'k7': 'g',
'k8': 'h',
'k9': 'i',
'k10': 'j',
'k11': 'k',
'k12': 'l',
}}
'''
Subcommands and Positionals
- Proposed help text generation based off of field descriptions and class documentation
- Proposed
_cli_prog_name
has same meaning as in argparse - Proposed
CliSubCommand
annotation for marking subcommands - Proposed
CliPositionalArg
annotation for marking positional arguments
from pydantic import BaseModel
from typing import Tuple, Type
from pydantic_settings import BaseSettings, CliSubCommand, CliPositionalArg, PydanticBaseSettingsSource
class FooPlugin(BaseModel, use_attribute_docstrings=True):
'''git-plugins-foo - Extra deep foo plugin command'''
my_feature: bool = False
'''Enable my feature on foo plugin'''
class BarPlugin(BaseModel, use_attribute_docstrings=True):
'''git-plugins-bar - Extra deep bar plugin command'''
my_feature: bool = False
'''Enable my feature on bar plugin'''
class Plugins(BaseModel, use_attribute_docstrings=True):
'''git-plugins - Fake plugins for GIT'''
foo: CliSubCommand[FooPlugin]
'''Foo is fake plugin'''
bar: CliSubCommand[BarPlugin]
'''Bar is also a fake plugin'''
class Clone(BaseModel, use_attribute_docstrings=True):
'''git-clone - Clone a repository into a new directory'''
repository: CliPositionalArg[str]
'''The repository to clone'''
directory: CliPositionalArg[str]
'''The directory to clone into'''
local: bool = False
'''When the resposity to clone from is on a local machine, bypass ...'''
shared: bool = False
'''Force the clone process from a reposity on a local filesystem ...'''
class Init(BaseModel, use_attribute_docstrings=True):
'''git-init - Create an empty Git repository or reinitialize an existing one'''
directory: CliPositionalArg[str]
'''The directory to clone into'''
quiet: bool = False
'''Only print error and warning messages; all other input will be suppressed'''
bare: bool = False
'''Create a bare repository. If GIT_DIR environment is not set, it is set ...'''
class Git(BaseSettings, use_attribute_docstrings=True):
'''git - The stupid content tracker'''
clone: CliSubCommand[Clone]
'''Clone a repository into a new directory'''
init: CliSubCommand[Init]
'''Create an empty Git repository or reinitialize an existing one'''
plugins: CliSubCommand[Plugins]
'''Fake GIT plugin commands'''
print(Git(_cli_prog_name='git', _cli_parse_args=['--help']))
'''
usage: git [-h] {clone,init,plugins} ...
git - The stupid content tracker
options:
-h, --help show this help message and exit
subcommands:
{clone,init,plugins}
clone Clone a repository into a new directory
init Create an empty Git repository or reinitialize an existing one
plugins Fake GIT plugin commands
'''
print(Git(_cli_prog_name='git', _cli_parse_args=['clone', '--help']))
'''
usage: git clone [-h] [--local bool] [--shared bool] REPOSITORY DIRECTORY
git-clone - Clone a repository into a new directory
positional arguments:
REPOSITORY The repository to clone
DIRECTORY The directory to clone into
options:
-h, --help show this help message and exit
--local bool When the resposity to clone from is on a local machine, bypass ...
--shared bool Force the clone process from a reposity on a local filesystem ...
'''
print(Git(_cli_prog_name='git', _cli_parse_args=['plugins', 'bar', '--help']))
'''
usage: git plugins bar [-h] [--my_feature bool]
git-plugins-bar - Extra deep bar plugin command
options:
-h, --help show this help message and exit
--my_feature bool Enable my feature on bar plugin
'''
@hramezani and @samuelcolvin
I am most of the way through but hit this limitation in the underlying environment variable parsing:
JSON is only parsed in top-level fields, if you need to parse JSON in sub-models, you will need to implement validators on those models.
Obviously, this would be very nice to have from a CLI perspective (e.g., nested lists etc.). I didn't know this was the case until I started writing nested tests. Do you have suggestions on what the best course of action is here? Ideally, I would love for it to be handled in environment parsing. If not, from CLI perspective the next easiest path would be nested dictionaries of strings. Is that something I can throw to Pydantic and have a reasonable expectation it will sort it out?
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (
40230ab
) 97.69% compared to head (cb9c1c3
) 98.65%. Report is 5 commits behind head on main.
:exclamation: Current head cb9c1c3 differs from pull request most recent head a984f32. Consider uploading reports for the commit a984f32 to get more accurate results
Additional details and impacted files
@@ Coverage Diff @@
## main #214 +/- ##
==========================================
+ Coverage 97.69% 98.65% +0.96%
==========================================
Files 5 5
Lines 347 597 +250
Branches 76 152 +76
==========================================
+ Hits 339 589 +250
Misses 6 6
Partials 2 2
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks @kschwab for this patch 🙏
Regarding JSON parsing issue please take the easiest path.
Also, you need to add documentation for this and show how to use this new feature. I can see you've added different configs, please add some examples for them in the doc to show what is the use case of them.
Sorry @hramezani, I should have been more specific with my question. I am at a fork and am looking for guidance on which direction to take.
Is there any background on why nested JSON isn't supported in EnvSettingsSource
? Was there something that scared us away from adding it? Is it a massive pile of work? etc.
My preference would be to solve it there so that the CLI and ENV are consistent. Just trying to ascertain if that is a reasonable direction or not.
I think the comment that you brought from the doc is an old one and related to the first time that nested env support added to Pydantic V1.
Here is my understanding from the comment: if you have a settings class like
class SubValue(BaseModel):
v4: str
v5: int
class TopValue(BaseModel):
v1: str
v2: str
v3: str
sub: SubValue
class Cfg(BaseSettings):
v0: str
top: TopValue
model_config = SettingsConfigDict(env_nested_delimiter='__')
you only can provide JSON values for Cfg.top
. e.g.
V0=test TOP='{"v1": "json-1", "v2": "nested-2", "v3": "3", "sub": {"v4": "v4", "v5": 5}}'
and it is not possible to provide JSON values for sub fields like:
V0=test TOP='{"v1": "json-1", "v2": "nested-2", "v3": "3"}' TOP__SUB='{"v4": "v4", "v5": 5}'
But, you can see that passing JSON values for sub fields also works. So, Probably we can remove it.
Got it. So it's either a bug or something invalid in the test case. Below is an example of the issue I am seeing:
import os
from pydantic import BaseModel
from typing import Optional, List
from pydantic_settings import BaseSettings
class Child(BaseModel):
num_list: Optional[List[int]] = None
class Cfg(BaseSettings, env_nested_delimiter='__'):
child: Optional[Child] = None
os.environ['CHILD__NUM_LIST'] = '[1,2,3]'
print(Cfg().model_dump())
'''
Traceback (most recent call last):
File "/home/kschwab/pydantic-settings/example.py", line 23, in <module>
print(Cfg().model_dump())
File "/home/kschwab/pydantic-settings/pydantic_settings/main.py", line 93, in __init__
super().__init__(
File "/home/kschwab/.local/lib/python3.10/site-packages/pydantic/main.py", line 171, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Cfg
child.num_list
Input should be a valid list [type=list_type, input_value='[1,2,3]', input_type=str]
For further information visit https://errors.pydantic.dev/2.6/v/list_type
'''
If you note the input_value='[1,2,3]'
in the validation error it got the correct input, it just did not convert it into a list. This is what led me to believe that nested JSON must not be supported.
It's a bug. Here is the related issue
Well, not great that it's a bug, but good in the context of this PR since it allows us to keep using EnvSettingsSource
as the backbone. We'll have to address that issue first before this PR goes in.
I'll continue on with rest of documentation, cleanup, etc. and loop back to it if it's still there.
The bulk of the functionality is now complete. Will continue to work on cleanup, comments, docs etc. over next few days. The only key remaining feedback items needed for any adjustments are:
- Subcommand and Positional Argument Implementation (refer to GIT CLI example for reference)
- [ ] Does the
CliSubCommand
andCliPositionalArg
annotations for denoting their respective meanings look OK?- There are a few ways to accomplish this. I took the stance of a more explicit annotation at the user level for denoting subcommands and positional arguments. Overall, it is quite clean and makes intent clear when reading the model. The alternative approach is implicit coercion, with union fields into subcommands and required fields into positionals. This takes flexibility away from user, who may want a similar experience to that as environment variable overrides. Therefore, the default experience is in line with how Pydantic is already structured, and deviation from the expectation is expressed through these annotations.
- [ ] Is single subparser per model OK?
- I took an opinionated stance of only supporting a single subcommand group (single subparser) per model. If we want to support multiple subparsers, we would likely need to go back to an earlier version I had here. I changed from that direction however because:
- Better control of generated help text. With a union implementation, the help text for the subcommand gets pulled from the class doc, which works so long as it is not long. However, in our application, we work with larger models that do have verbose class documentation so it can flood the help text. Simple workaround would be parsing the first sentence of the class doc etc. but not worth it IMO. With current implementation, the subcommand help text is the field doc string making it very straightforward. Then, the class doc itself is displayed when using help on the subcommand itself. e.g.
git --help
shows condensed help for theclone
subcommand, whereasgit clone --help
shows the expanded class doc help text there. - Explicit naming of subcommands. In union implementation, the subcommand name must be coerced from the class name or some other mechanism. With current implementation, it is whatever the subcommand field name is.
- Avoids the need for Discriminated Unions. Absolutely an option but requires more boilerplate, which takes away from the clean aesthetic. This was actually the forcing issue that caused me to change direction. The normal union selection was not strong enough so users would have needed to add discriminated unions for any subcommand.
- Better control of generated help text. With a union implementation, the help text for the subcommand gets pulled from the class doc, which works so long as it is not long. However, in our application, we work with larger models that do have verbose class documentation so it can flood the help text. Simple workaround would be parsing the first sentence of the class doc etc. but not worth it IMO. With current implementation, the subcommand help text is the field doc string making it very straightforward. Then, the class doc itself is displayed when using help on the subcommand itself. e.g.
- I took an opinionated stance of only supporting a single subcommand group (single subparser) per model. If we want to support multiple subparsers, we would likely need to go back to an earlier version I had here. I changed from that direction however because:
- [ ] Positional arguments always placed before subcommands?
- Another opinionated stance, although of smaller concern. As is today, there is no requirement with regards to field ordering within the definition of a model. However, the resultant CLI will reorder positional args before subcommands. This is mostly to prevent users from shooting themselves in the foot. Perhaps this is best to just change to a flag, i.e., turn the safety off?
- [ ] Does the
Outside of the above, I think everything else is pretty straightforward and aligns with precedence already set by environment variables.
Hi @hramezani, need some help with resolving the below:
- Many of the doc examples use CLI
--help
to view generated help text etc. However, this results in an exit, which pytest does not like. How to update pytest such that exit is allowed in these expected cases? - Many of the doc examples use the recently merged
use_attribute_docstrings
flag, which does not appear available in tests. How to update the tests such that the pydantic version includes this commit? - Is there a way to generate the docs locally for viewing?
Many of the doc examples use CLI --help to view generated help text etc. However, this results in an exit, which pytest does not like. How to update pytest such that exit is allowed in these expected cases?
You can use
```py test="skip"
to skip running on those blocks
Many of the doc examples use the https://github.com/pydantic/pydantic/pull/6563 use_attribute_docstrings flag, which does not appear available in tests. How to update the tests such that the pydantic version includes this commit?
This will be available on Pydantic 2.6
. It is in beta and will be released soon. We are planing to release the next pydantic-settings
after Pydantic 2.6
. So, in the end, we can use use_attribute_docstrings
flag.
Is there a way to generate the docs locally for viewing?
Not right now. pydantic-settings
docs will be rendered under pydantic docs. But I will work on it to make it possible to render docs in pydantic-settings
Hi @hramezani, almost ready to move this into review. There are three final items:
- I do not plan to add coverage for this line, unless there is a safe/easy way in pytest to modify
sys.argv
. Is this acceptable? If so, how to exclude? - I would like to exclude this function from coverage or leverage the original test case if you can help locate. I pulled this from pydantic._internal._repr.py. Please provide guidance on desired direction.
- Also related to above number two, I am hitting a
mypy
linting issue related totyping_extensions.TypeAliasType
here. This is directly copied from the_repr
version, how would you like me to proceed?
I do not plan to add coverage for this line, unless there is a safe/easy way in pytest to modify sys.argv. Is this acceptable? If so, how to exclude?
Can't we use monkeypatch
fixture like monkeypatch.setattr("sys.argv", ["test1", "-f", "test2"])
?
I would like to exclude this function from coverage or leverage the original test case if you can help locate. I pulled this from pydantic._internal._repr.py. Please provide guidance on desired direction.
There are some tests in pydantic repo
- https://github.com/pydantic/pydantic/blob/c1dff153c8ba2894e2b060cf6e53de6851ba4cd9/tests/test_utils.py#L83
- https://github.com/pydantic/pydantic/blob/c1dff153c8ba2894e2b060cf6e53de6851ba4cd9/tests/test_utils.py#L106
Also related to above number two, I am hitting a mypy linting issue related to typing_extensions.TypeAliasType here. This is directly copied from the _repr version, how would you like me to proceed?
Please ignore it
Thanks @hramezani for the feedback. Ready for review 👍🏾
please update
@hramezani @samuelcolvin Thank you for the feedback, really appreciate it and will make updates accordingly. We found a couple of changes we'll also bring in after playing with it internally:
- The enum support should be moved deeper into the environment variable parsing. Will open a separate issue and address it there.
- We plan to integrate this with other existing parsers (one of which is pytest), so will be providing some support to "export" the internally generated parser and likely a pytest specific example. Still working on how to best accomplish this but will push once ready.
- The enum support should be moved deeper into the environment variable parsing. Will open a separate issue and address it there.
Opened #252 to address this.
@hramezani @samuelcolvin @dmontagu updates are complete and ready for review.
@hramezani @samuelcolvin @dmontagu updates are complete and ready for review.
Thanks @kschwab for the update. We will review it later and probably this will be included in the next release
@kschwab Sorry for the late response. Finally, I found time to work on this and test it. The implementation is good and the doc is very well 👍
There are something that need to be fixed:
-
If you enable
SettingsConfigDict(cli_parse_args=True)
, then the CLI source will be the first priority and there is no way to change the priority. This is not like other sources that we've added before liketoml
andyaml
. you include them and change the priority as well. So, this needs to be changed and we should have the same behavior as other new sources. -
It doesn't work with aliases. with the following model:
class Settings(BaseSettings):
apple: str = Field(alias='alias')
model_config = SettingsConfigDict(cli_parse_args=True)
we can't use --alias foo
.
As alias support is included in other sources, we have to have it in CLI as well.
- It doesn't work with case_sensitive config and always expects the field name. With the following model:
class Settings(BaseSettings):
foo: str
model_config = SettingsConfigDict(cli_parse_args=True, case_sensitive=False) # default is `False`
Then --FOO bar
doesn't work. in env source, we convert all the env names to lowercase in when case_sensitive=False
- It doesn't work well with dataclasses in sub model. for example with:
@pydantic_dataclasses.dataclass
class MyDataclass:
foo: int
class Settings(BaseSettings):
n: MyDataclass
You have to pass --n '{"foo": 123} but --n.foo 123 doesn't work. but it works for normal basemodel submodels
Thanks for your great work here ❤️
Thanks @hramezani for the review, we'll address the feedback items soon.
With respect to the CliSettingsSouce
prioritization, originally, we had prioritization that could be changed but removed it after our discussion here. The conclusion was to use fixed prioritization so as to not break backwards compatibility, and then revisit dynamic prioritization once this issue is fixed in V3.
We can change it back to the original if you would prefer, let me know your thoughts.
Thanks @hramezani for the review, we'll address the feedback items soon.
With respect to the
CliSettingsSouce
prioritization, originally, we had prioritization that could be changed but removed it after our discussion here. The conclusion was to use fixed prioritization so as to not break backwards compatibility, and then revisit dynamic prioritization once this issue is fixed in V3.We can change it back to the original if you would prefer, let me know your thoughts.
Yeah. that was my first thought. but after talking with the team we decided to have the CLI source like other sources. Can't we implement the new CLI source like other new sources(toml, yaml) and include it like them? it would be good to Sorry about that.
No worries!
Can't we implement the new CLI source like other new sources(toml, yaml) and include it like them?
I haven't looked at the latest toml, yaml source changes and how it was integrated, but will try and align with it if possible. I don't think there should be any issues. The only thing that comes to mind is the external parser support, but I think it should be fine. We'll make the updates and see where we land 🙂
@hramezani I've addressed the issues raised. For case-insensitive matching, this is not supported on external parsers, it is only supported on the internal parser.
@hramezani I've addressed the issues raised. For case-insensitive matching, this is not supported on external parsers, it is only supported on the internal parser.
Thanks @kschwab for addressing the issues. I will check it later.
Thanks @samuelcolvin for the review.
I added the differentiation with the last commit. Required fields are marked as (required)
and optional fields are marked as (default: val)
. This also aligns with argparse ArgumentDefaultsHelpFormatter in how it delineates optional/default arguments.
usage: demo.py [-h] [--foo str] [--bar int]
Settings for the application.
options:
-h, --help show this help message and exit
--foo str Set the foo value. (required)
--bar int (default: 123)
@kschwab Thanks for your update.
I am going to merge the PR if you are done with your updates.
It will be released soon 🚀
Thanks @kschwab for your great effort here ❤️
Thanks @hramezani! Glad to see it go in 👍🏾