hera
hera copied to clipboard
Pydantic should default to v2, or hera should allow it to be configured
Pre-bug-report checklist
1. This bug can be reproduced using pure Argo YAML
- [ ] Yes 👉 Please report a bug to the Argo Workflows GitHub 👈
- [ ] No
- [x] n/a (it is an issue with how hera imports pydantic)
If yes, it is more likely to be an Argo bug unrelated to Hera. Please double check before submitting an issue to Hera.
2. This bug occurs in Hera when...
- [ ] exporting to YAML
- [ ] submitting to Argo
- [ ] running on Argo with the Hera runner
- [x] other: when using hera's pydantic models
Bug report
Describe the bug A clear and concise description of what the bug is: When mixing in hera models into my own pydantic models, I get errors like this:
return cls.model_validate(kwargs, strict=True)
E TypeError: BaseModel.validate() takes 2 positional arguments but 3 were given
Because I'm using the default in pydantic (V2 models), but hera defaults to the v1 models:
I think this is a bug b/c _PYDANTIC_VERSION gets set to 2 even though the imports are v1.
Error log if applicable:
To Reproduce Full Hera code to reproduce the bug:
python -c 'from hera.workflows.models import Volume; print(Volume.__mro__)'
(<class 'hera.workflows.models.io.k8s.api.core.v1.Volume'>, <class 'hera.shared._pydantic.BaseModel'>, <class 'pydantic.v1.main.BaseModel'>, <class 'pydantic.v1.utils.Representation'>, <class 'object'>)
Expected behavior A clear and concise description of what you expected to happen:
I think this should default to import pydantic.v2 first, but given that both versions are supported according to comments in _pydantic.py, I think this could also be a feature request in that there should be some variable like HERA_PYDANTIC_API that defaults to v2, but lets others choose v1 if they're still stuck on that version.
Environment
- Hera Version: 5.14.0
- Python Version: 3.X.X
- Argo Version: 3.X.X
Additional context Add any other context about the problem here.
Hi @terrykong, thanks for adding the issue. The original PR #795 to add V2 has some history.
The problem is specifically that Hera's codebase was written using Pydantic V1, including the generated Argo model classes. This would require a fairly large time investment to provide models for both versions, and increase the maintenance burden for the two codepaths throughout the codebase (@samj1912 explored this initially but decided against it). Long term we do plan to migrate the codebase to V2, but it should not affect end users at this time.
I'm curious why you would need to mix Hera models with your own? As Hera builds workflows into yaml I'm not sure what missing feature your own V2 "mixin" models would be providing?
Also, the script runner feature lets you use Pydantic V1 or V2 models within your script template functions.
Hi @elliotgunton
So in my use case, I was authoring configuration that used pydantic's BaseModel, but tried to add in hera types as fields, and ran into this issue. So something like this:
from hera.workflows.models import Volume
from pydantic import BaseModel
class MyConfig(BaseModel):
foobar: str
volume: Volume
MyConfig.model_validate({"foobar":"a", "volume": {"host_path": {"path": "/mnt"}}})
# Gives
'''
File "/.../venv/lib/python3.10/site-packages/pydantic/main.py", line 509, in model_validate
return cls.__pydantic_validator__.validate_python(
'''
class H(BaseModel):
path: str
class V(BaseModel):
host_path: H
class MyConfigWhichDoesWork(BaseModel):
foobar:str
volume: V
MyConfigWhichDoesWork.model_validate({"foobar":"a", "volume": {"host_path": {"path": "/mnt"}}})
# And this works
So basically I want to use pydantic with hera's types but I think the v1 v2 API mismatch is causing pydantic's cryptic error message.
Thank you @terrykong! Using the Hera classes as part of your configs makes sense 👍 The point remains that it will take a significant effort to allow both v1 and v2 classes to be exported from the library, so I can't give an expected timeframe/exact details for the time being.
Seeing a potentially related error
hera generate yaml --to generated/ -e __pycache__/* --recursive _hera/
warning: The `hera` CLI is a work-in-progress, subject to change at any time!
Traceback (most recent call last):
File "/Users/devgoyal/.pyenv/versions/argo/bin/hera", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/__main__.py", line 26, in main
cappa.invoke(Hera, argv=argv)
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/cappa/base.py", line 141, in invoke
with resolved.get(concrete_output) as value:
File "/Users/devgoyal/.pyenv/versions/3.11.7/lib/python3.11/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/cappa/invoke.py", line 65, in get
result = callable(**finalized_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/_cli/generate/yaml.py", line 35, in generate_yaml
for workflow in load_workflows_from_module(path):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/hera/_cli/generate/yaml.py", line 123, in load_workflows_from_module
spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/Users/devgoyal/Development/argo/ml-features/dbt-features/_hera/dating_outcomes/rating_matches_daily_cron.py", line 84, in <module>
"success": LifecycleHook(
^^^^^^^^^^^^^^
File "/Users/devgoyal/.pyenv/versions/3.11.7/envs/argo/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for LifecycleHook
arguments
value is not a valid dict (type=type_error.dict)
make: *** [Makefile:11: hera-generate] Error 1
From this code
Step(
name="post-processing-step",
template_ref=matches_etl_ref,
arguments=[
Parameter(name="actions", value=["matches"]),
],
hooks={
"success": LifecycleHook(
arguments=[Parameter(name="metro", value="matches_etl")],
expression="steps['post-process-step']=='Succeeded'",
template_ref=metrics_ref,
)
},
)
Hey @dev-goyal - that error is unrelated as it's because LifecycleHook isn't yet a special Hera class. The error
arguments
value is not a valid dict (type=type_error.dict)
is because you need to use from hera.workflows.models import Arguments to rewrite it as
...
arguments=Arguments(
parameters=[
Parameter(name="metro", value="matches_etl"),
],
),
...
or as a dict without the import
...
arguments={
"parameters": [
Parameter(name="metro", value="matches_etl"),
],
},
...
Hey @dev-goyal - that error is unrelated as it's because
LifecycleHookisn't yet a special Hera class. The errorarguments value is not a valid dict (type=type_error.dict)is because you need to use
from hera.workflows.models import Argumentsto rewrite it as... arguments=Arguments( parameters=[ Parameter(name="metro", value="matches_etl"), ], ), ...or as a dict without the import
... arguments={ "parameters": [ Parameter(name="metro", value="matches_etl"), ], }, ...
Sorry for the noise, that is correct!