cyclopts
[Feature request]: Nested pydantic validation
Adding support for nested pydantic model.
For instance:
from cyclopts import App
from pydantic import BaseModel

app = App()

class FuncArgs(BaseModel):
    a: int
    b: str

def f(input: FuncArgs):
    ...

class MainInputs(BaseModel):
    f_arg: FuncArgs
    name: str

@app.default
def main(inputs: MainInputs):
    f(inputs.f_arg)  # the field is named f_arg
    ...
Not sure what the nested config would look like from the CLI perspective, but here are two proposals:
python cli.py --f-arg--a 1 --f-arg--b 1
python cli.py --f-arg.a 1 --f-arg.b 1
This is a feature that is available in Hydra/OmegaConf that would be really useful to have here!
Thanks in advance
So this feature would probably be a bit complicated to implement, but it might not. Let's brainstorm. Here are some thoughts.
1. Cyclopts can explicitly check if the type hint is a pydantic BaseModel:
    issubclass(MainInputs, BaseModel)
2. Pydantic internally has pretty good coercion logic, so Cyclopts can really provide all values as strings:
    >>> MainInputs(f_arg=dict(a="5", b="foo"), name="bar")
    MainInputs(f_arg=FuncArgs(a=5, b='foo'), name='bar')
3. Of the proposed CLI syntaxes, I think the second one makes more sense:
    python cli.py --f-arg.a 1 --f-arg.b 1
   But it would actually have to be something like:
    python cli.py --inputs.f-arg.a 1 --inputs.f-arg.b 1
4. Currently, Cyclopts does not support dictionary type hints (**kwargs is a special case).
5. We could add dictionary support using the dot-notation mentioned in (3), independent of Pydantic stuff. Parameters annotated with a dictionary-like type-hint MUST be keyword-only; so your default handler would become:
    @app.default
    def main(*, inputs: MainInputs):
        f(inputs.f_arg)
        ...
6. What would the help-page look like? Would Cyclopts recursively traverse the pydantic models? Would the docstring of MainInputs.f_arg be displayed anywhere?
    ╭─ Parameters ──────────────────────────────────────────────────╮
    │ * --inputs.f-arg.a  Docstring from pydantic model. [required] │
    │ * --inputs.f-arg.b  Docstring from pydantic model. [required] │
    │ * --inputs.name     Docstring from pydantic model. [required] │
    ╰───────────────────────────────────────────────────────────────╯
7. If annotated: inputs: Annotated[MainInputs, Parameter(...)], what does Parameter apply to? The whole model? Do we ignore some fields like help=?
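A minimal sketch of the recursive traversal raised in the help-page question above. It uses stdlib dataclasses as a stand-in for pydantic models (an assumption, so that the example runs without extra dependencies); the helper name `cli_names` and its `max_depth` cutoff are hypothetical and not part of Cyclopts.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class FuncArgs:  # dataclass stand-in for the pydantic model in the request
    a: int
    b: str


@dataclass
class MainInputs:
    f_arg: FuncArgs
    name: str


def cli_names(model, prefix, max_depth=3):
    """Yield (dotted option name, type) pairs for a possibly nested dataclass.

    Hypothetical helper: nested models are expanded until max_depth is reached.
    """
    for field in dataclasses.fields(model):
        name = f"{prefix}.{field.name.replace('_', '-')}"
        if dataclasses.is_dataclass(field.type) and max_depth > 1:
            # Recurse into the nested model, extending the dotted prefix.
            yield from cli_names(field.type, name, max_depth - 1)
        else:
            yield name, field.type


names = list(cli_names(MainInputs, "--inputs"))
for option, hint in names:
    print(option, hint.__name__)
```

With the parameter name as prefix, every leaf field gets one dotted option, which is the shape of the help page mocked up above.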
Thanks for your answers !
> But it would actually have to be something like:
> python cli.py --inputs.f-arg.a 1 --inputs.f-arg.b 1
I am not sure I understand why it would have to be like this. My gut feeling is that it would be a bit counterintuitive to have to say inputs each time.
> - We could add dictionary support using the dot-notation mentioned in (3), independent of Pydantic stuff. Parameters annotated with a dictionary-like type-hint MUST be keyword-only; so your default handler would become:
>     @app.default
>     def main(*, inputs: MainInputs):
>         f(inputs.f_arg)
>         ...
I think it makes sense to allow only keyword-only parameters in this case.
> - What would the help-page look like? Would Cyclopts recursively traverse the pydantic models? Would the docstring of MainInputs.f_arg be displayed anywhere?
>     ╭─ Parameters ──────────────────────────────────────────────────╮
>     │ * --inputs.f-arg.a  Docstring from pydantic model. [required] │
>     │ * --inputs.f-arg.b  Docstring from pydantic model. [required] │
>     │ * --inputs.name     Docstring from pydantic model. [required] │
>     ╰───────────────────────────────────────────────────────────────╯
If there are too many nested params, it might be a bit too long to traverse the whole thing. But in most cases it should be fine. Maybe you could recurse up to, say, 3 levels.
> - If annotated: inputs: Annotated[MainInputs, Parameter(...)], what does Parameter apply to? The whole model? Do we ignore some fields like help=?
I would say that it applies to the whole block; otherwise, if I wanted it to apply only to a nested field, I would annotate that particular field.
> I am not sure I understand why it would have to be like this. My gut feeling is that it would be a bit counterintuitive to have to say inputs each time.
Basically, there are two things:
- What if you have 2 pydantic models as parameters?
- This now introduces an inconsistency (why should a pydantic model's CLI keywords not have the parameter name?).
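To make the first point concrete, here is a small sketch of what happens with two model-typed parameters when the parameter name is dropped from the option. The parameter names (`train`, `evaluation`) and their fields are invented for illustration only.

```python
from collections import Counter

# Hypothetical command with two model-typed parameters and their fields.
params = {
    "train": ["lr", "name"],
    "evaluation": ["batch-size", "name"],
}

# Option names without and with the parameter name as a prefix.
unprefixed = [f"--{field}" for fields in params.values() for field in fields]
prefixed = [f"--{param}.{field}" for param, fields in params.items() for field in fields]


def duplicates(options):
    """Return the option names that occur more than once."""
    return sorted(opt for opt, count in Counter(options).items() if count > 1)


print(duplicates(unprefixed))  # the shared field name collides
print(duplicates(prefixed))    # the prefix keeps every option unique
```

Without the prefix, both models claim `--name`; with it, each option maps back to exactly one parameter.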
> If there are too many nested params, it might be a bit too long to traverse the whole thing. But in most cases it should be fine. Maybe you could recurse up to, say, 3 levels.
I wonder how Annotated[..., Parameter()] would interact with Pydantic; that may be a possible solution.
I think there might be a workable solution.
> I wonder how Annotated[..., Parameter()] would interact with Pydantic; that may be a possible solution.
I think that pydantic will not care about what you put in the Annotated unless it exposes some private method. So Parameter should be fine.
Yeah, you are right, my bad. In my head it would only be one Pydantic model.
My goal is to replace my code that uses Hydra + OmegaConf with Cyclopts + Pydantic, mainly because I think that pydantic nailed dataclass validation and offers serialization for free. Small example: in this code I define all my config management with one pydantic class and would like to just map it naturally to a CLI app.
But maybe it makes more sense to just allow nested pydantic models for now.
So I am fine with
python cli.py --inputs.f-arg.a 1 --inputs.f-arg.b 1
and to have nested documentation as well.
:smiley:
So I think the first step on this is to add dictionary support, which I can work on. This feature will certainly have a slower turn around time than other simpler features, but hopefully we make steady progress!
> So I think the first step on this is to add dictionary support, which I can work on. This feature will certainly have a slower turn around time than other simpler features, but hopefully we make steady progress!
Thanks, no worries on time, happy to contribute if necessary tho not sure where to start
Hey @BrianPugh, I am still planning on using cyclopts for handling my config. It is still the best CLI tool that I have found out there. But I am still limited by this :cry: Any chance you could point me in the direction where I could start implementing dictionary support? (And later on nested pydantic classes.)
Happy to do integration myself :)
Best
I think this feature might be more complicated than we initially thought, but here are a few pointers.
- The primary CLI token steps are performed here. Basically we want to map inspect.Parameter objects to their string token(s). We do a little bit of a hack and sometimes associate non-string values to an inspect.Parameter; this is for "implicit value" parsing, such as boolean flags. This is compensated for here.
- Part of the responsibility of _parse_kw_and_flags is to also parse **kwargs if available. This is the closest thing Cyclopts has to dictionary parsing.
- The main recursive string-token to actual-python-type logic is here. Note that this only takes in the type hint and doesn't know about the inspect.Parameter.
At a high level, I think you would have to:
1. Add "."-splitting logic to _parse_kw_and_flags. Make sure the associated annotated type is a dictionary. I haven't fully thought it through, but maybe add something like the following:
    cli_key_tokens = cli_key.split(".")
    iparam, implicit_value = cli2kw[cli_key_tokens[0]]
    if len(cli_key_tokens) > 1:
        assert iparam.annotation in (dict, pydantic.BaseModel)  # but more robustly with get_origin and stuff.
        # Start from this parameter's own dict, creating it if needed.
        d = mappings.setdefault(iparam, {})
        # Walk the intermediate keys, creating nested dicts along the way.
        for k in cli_key_tokens[1:-1]:
            d = d.setdefault(k, {})
        d[cli_key_tokens[-1]] = cli_values
2. Maybe add additional logic here to recursively convert each value if it's a dict. We should handle TypedDict annotations, as well.
3. In addition to (2), this might be the correct location to coerce it into the Pydantic class type.
Note: throughout the code I use iparam for an inspect.Parameter, and I use cparam for a cyclopts.Parameter.
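The dot-splitting step from the pointers above can be tried out on its own. This is only a standalone sketch: the function name `nest_tokens` and the key normalization (stripping leading dashes, turning "-" into "_") are assumptions for illustration and do not reflect Cyclopts internals.

```python
def nest_tokens(pairs):
    """Fold ("--inputs.f-arg.a", "1")-style pairs into nested dicts.

    Hypothetical helper; values are left as strings for later coercion.
    """
    mapping = {}
    for cli_key, value in pairs:
        # Split "--inputs.f-arg.a" into parent keys and the final leaf key.
        *parents, leaf = cli_key.lstrip("-").split(".")
        d = mapping
        for key in parents:
            # Descend, creating intermediate dicts as needed.
            d = d.setdefault(key.replace("-", "_"), {})
        d[leaf.replace("-", "_")] = value
    return mapping


nested = nest_tokens(
    [
        ("--inputs.f-arg.a", "1"),
        ("--inputs.f-arg.b", "foo"),
        ("--inputs.name", "bar"),
    ]
)
print(nested)
```

The sketch does not handle a key that is used both as a leaf and as a parent (for example `--inputs.name` together with `--inputs.name.first`); a real implementation would need to report that as an error.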
Thanks for the guidelines @BrianPugh, really helpful! I will start to dig into it.
Hello @samsja ! Did you have the time to start implementing this?
> Hello @samsja ! Did you have the time to start implementing this?
Unfortunately no, I did a small PoC of how it would look here: https://github.com/samsja/pydantic_cli tho it is not using cyclopts yet
> > Hello @samsja ! Did you have the time to start implementing this?
> Unfortunately no, I did a small PoC of how it would look here: https://github.com/samsja/pydantic_cli tho it is not using cyclopts yet
Pretty neat, thanks for sharing! We ended up using Hydra, but I'll keep an eye on your package for the future.
I certainly think Hydra is a good option for pretty complex configurations (e.g. for neural networks). I'll try working on this after other, higher priority features are complete (e.g. the config-file stuff in #165, which may actually help @dbuades, and CLI completions).
Heads up: I started working on dictionary-support in the dict-support branch. It's not yet in a working state, but giving a heads up so hopefully we don't duplicate efforts. It won't immediately support pydantic, but it's a good stepping stone for a subsequent PR. The current intention is to support dict and TypedDict annotations.
Update on this: the dict-support branch now supports parsing:
- dict
- TypedDict
- namedtuple / NamedTuple
- dataclass
- pydantic
- attrs
However, none of the help and config related features have been implemented. Also, I think I'll need to fundamentally reimplement some of this stuff as the current implementation doesn't respect Parameter annotations of keys inside these dicts/objects. However, the current implementation exercise has been helpful.
These will eventually make their way into a v3.0.0 release when I'm happier with the functionality/implementation.
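As a rough illustration of what parsing string tokens into one of these types involves, here is a sketch for the dataclass case only. It is not the dict-support branch's implementation; the `coerce` helper is hypothetical and only handles dataclasses and simple callable types such as int and str.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class FuncArgs:
    a: int
    b: str


@dataclass
class MainInputs:
    f_arg: FuncArgs
    name: str


def coerce(hint, value):
    """Recursively convert a string token, or a dict of them, to `hint`.

    Hypothetical helper: dataclass hints consume a dict keyed by field name,
    any other hint is called directly on the token.
    """
    if dataclasses.is_dataclass(hint):
        kwargs = {
            field.name: coerce(field.type, value[field.name])
            for field in dataclasses.fields(hint)
        }
        return hint(**kwargs)
    return hint(value)  # e.g. int("1") or str("foo")


inputs = coerce(MainInputs, {"f_arg": {"a": "1", "b": "foo"}, "name": "bar"})
print(inputs)
```

With pydantic models the manual recursion would be unnecessary, since, as noted earlier in the thread, pydantic coerces nested dicts of strings by itself.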
amazing !
@BrianPugh
I've been working on a library for automating LLM function/tool calling, where it parses functions/TypedDict/NamedTuple/Pydantic model and can invoke them from their raw value (for typeddict, namedtuple and pydantic model specifically we need dictionary objects as input).
I don't know how far you've implemented as I can't access dict-support branch but this might be helpful: https://github.com/synacktraa/tool-parse/blob/master/tool_parse/compile.py
This project is not as complex as a CLI parsing library, but the basic logic for parsing and invoking the objects is almost the same.
I'm still changing a bunch of things, but you can take a look/try the dev-3.0.0 branch which has the changes. It's not heavily tested yet, and I'm still going to change some things, but it should be working.
You can see an example unit test using pydantic here. Timeline for v3.0.0 is slipping a lil bit, but I'm still consistently working on it. Just a lot of late summer/early fall events in my life 🙈 .
v3.0.0 has been released! Please let me know if you run into any issues!