pydantic-core icon indicating copy to clipboard operation
pydantic-core copied to clipboard

Take a `mode` parameter in `validate_python`

Open DanielRosenwasser opened this issue 11 months ago • 4 comments

This PR introduces a mode parameter in validate_python so that developers can get similar affordances to validate_json when strict=True.

Currently, validating dict objects against other kinds of models fails when strict is turned on, since the validation logic expects specific instances of a class. One could instead just lean on validate_json; however, if data has already been hydrated into Python dictionary objects (usually because parsing has happened elsewhere, sometimes in a format other than JSON entirely), then you're out of luck with some questionable solutions.

mode is meant to be either the string "python" or "json", defaulting to "python". Other values are currently rejected.

Fixes #712.

DanielRosenwasser avatar Mar 12 '24 23:03 DanielRosenwasser

Codecov Report

Merging #1226 (3acf358) into main (ab503cb) will decrease coverage by 0.34%. Report is 34 commits behind head on main. The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1226      +/-   ##
==========================================
- Coverage   90.21%   89.88%   -0.34%     
==========================================
  Files         106      106              
  Lines       16339    16663     +324     
  Branches       36       40       +4     
==========================================
+ Hits        14740    14977     +237     
- Misses       1592     1667      +75     
- Partials        7       19      +12     
Files Coverage Δ
src/url.rs 98.31% <100.00%> (ø)
src/validators/mod.rs 96.08% <100.00%> (+0.05%) :arrow_up:

... and 12 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8db3a6f...3acf358. Read the comment docs.

codecov[bot] avatar Mar 13 '24 00:03 codecov[bot]

CodSpeed Performance Report

Merging #1226 will not alter performance

Comparing DanielRosenwasser:json_mode (3acf358) with main (8db3a6f)

Summary

✅ 148 untouched benchmarks

codspeed-hq[bot] avatar Mar 13 '24 00:03 codspeed-hq[bot]

Hm, I'm not sure what exactly has gone wrong with test-pydantic-integration. This is my first time contributing to the library, so I'm open to pointers (or being told I'm way way way off 😄).

DanielRosenwasser avatar Mar 13 '24 00:03 DanielRosenwasser

Welcome, and thanks for the PR! The integration test failures look a bit baffling to me too, to be honest. Would need to dig to figure that out.

There's some history here in #717 / #718. I think the problem is that the coupling of python / json mode isn't really just about this runtime parameter in the state but more the Input trait implementations offer very different code paths.

I would love to add this feature but I think that the minimal change here is sadly not sufficient. We'd need to refactor quite aggressively to let python-like inputs work in JSON mode.

davidhewitt avatar Mar 14 '24 11:03 davidhewitt

Ah, okay - #712 was left open and I didn't see those PRs. CCing @adriangb @dmontagu

I vaguely have an idea of the issues with the Input trait - namely that one deals with JsonValues and the other with Python objects, and so the code paths are concerned with entirely different kinds of values.

Given that... do you have any idea why the added test case

foo = v.validate_python({'a': 'hello', 'b': True}, strict=True, mode='json')

successfully validates? Or perhaps a better question - what doesn't work with the current PR?

DanielRosenwasser avatar Mar 14 '24 18:03 DanielRosenwasser

Welcome @DanielRosenwasser and thanks for the PR, kudos for trying.

Or perhaps a better question - what doesn't work with the current PR?

The problem is we have specific logic in

https://github.com/pydantic/pydantic-core/blob/352d40f6928498856ab119f1df57cbb2b5df1cb7/src/input/input_json.rs#L40

To deal with the fact that JSON only has a small number of primitive types. The idea is that (for example) a JSON Array is a valid input to a tuple field even in strict mode, while a Python list would not be valid.

Anywhere where we use the rust Input implementation to aid in validation of non-JSON types would (I guess) not work with your PR. So the following cases would pass with JSON input in strict mode, pass with Python input in lax mode, but fail with a Python input in strict mode, even if we set mode='json':

  • string input to a date, datetime, timedelta, time, bytes, Path
  • list/array input to a tuple, set, frozenset, deque, generator
  • int input to a Decimal
  • float input to a Decimal
  • dict input to a dataclass

Quite possibly more.

Personally I don't see a way to work around this without introducing another Input trait implementation for PyAny (not sure how that would work) or introducing lots more logic to the existing implementation.

The best solution AFAIK is to dump your input to JSON, then pass that to model_validate_json(), in most cases that will be pretty fast.

samuelcolvin avatar Mar 18 '24 16:03 samuelcolvin

Thanks again for your work @DanielRosenwasser, closing as per my comments above.

samuelcolvin avatar Mar 27 '24 15:03 samuelcolvin