Full refactoring of Schema

Open aure-olli opened this issue 6 years ago • 4 comments

Hello

I really like the way Schema validates data, but I felt limited in a few places: the rather limited Hook and JSON schema support, the fact that Schema systematically tries every key of a dict against every schema, and the fact that most of the work is done at each validation instead of at initialization.

So I eventually rewrote it: https://github.com/aure-olli/schema

I stayed as compatible as possible with the previous version, and the rewrite passes (almost) all of the previous tests, while adding a large number of new ones.

For the minor incompatibilities:

  • I removed the support for https://github.com/keleshev/schema/issues/139: I don't see why it is a problem. Schema({Optional('a'): 1, Optional('a'): 2}) can actually be useful, and silently removing one of them is clearly not any better.
  • The __repr__ of Schema unfortunately contains more clutter due to the precompilation, but this shouldn't be a problem (I just had to update some tests). The representation of Regex is also less clean.
  • I changed the callback of Hook: it now takes key, value, new, data in order to be able to edit new (useful for https://github.com/keleshev/schema/issues/204 for instance), which is still compatible with the tests and most examples written.
  • I allow calling json_schema without schema_id, which can be useful for generating a fragment of a JSON schema or an OpenAPI schema.
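
To illustrate the Hook callback shape described above, here is a minimal sketch in plain Python. It does not use the rewritten library: `apply_hooks` and `rename_to_lower` are hypothetical names invented for this example, and only the argument order (key, value, new, data) comes from the description.

```python
# Illustrative sketch only: apply_hooks and rename_to_lower are hypothetical
# names, not part of the schema library or of the rewrite.

def apply_hooks(data, hooks):
    """Copy `data` into a new dict, letting hooks edit the output.

    `hooks` maps a key to a callback taking (key, value, new, data):
    `new` is the output dict being built, `data` is the original input.
    """
    new = {}
    for key, value in data.items():
        handler = hooks.get(key)
        if handler is None:
            new[key] = value  # no hook for this key: keep the pair unchanged
        else:
            handler(key, value, new, data)  # the hook decides what goes in `new`
    return new


def rename_to_lower(key, value, new, data):
    # Store the value under a normalized key instead of the original one.
    new[key.lower()] = value


result = apply_hooks({"Name": "x", "age": 3}, {"Name": rename_to_lower})
print(result)  # {'name': 'x', 'age': 3}
```

Because the callback receives `new`, it can rename, drop, or add keys in the validated output rather than only reacting to a match.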

For the new features, there are many:

  • Everything is as precompiled as possible, and Dict is much smarter about which schemas to try with which key (depending on the value and the type of the key), as the test shows: https://github.com/aure-olli/schema/blob/35e114ab2be859b9a5de9d812fd8333c29f0cab6/test_schema.py#L1053
  • Everything has the same base class: BaseSchema
  • As said, Hook is much more powerful, and can also define a catch(key, error, new, data) function for the case where the value is not matched.
  • Optional is now a Hook (that shows its power), and Clean is similar but discards the key instead of saving it (they both can be used together)
  • Regex can now take a compiled pattern directly, and has a regex_lib option to use another regex library (such as https://pypi.org/project/regex/).
  • And most importantly, json_schema has been totally rewritten. It is now much more powerful and recursive, tries to do clever merging to compact the schema, can represent regex and comparable, ... and can be specialized for JSON schema or OpenAPI by passing target='json_schema' or target='openapi'.
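
The idea of doing the work at initialization rather than at each validation can be sketched roughly as follows. This is plain Python and not the code of the rewrite: `PrecompiledDict` and its attributes are hypothetical names, and the dispatch rule (exact keys first, then the type of the key) is an assumption made to keep the example small.

```python
# Illustrative sketch only: PrecompiledDict is a hypothetical class, not the
# Dict implementation from the rewrite.

class PrecompiledDict:
    """Split the key schemas once, so validation only does lookups."""

    def __init__(self, spec):
        self.literal_keys = {}  # exact key -> expected value type
        self.typed_keys = {}    # key type  -> expected value type
        for key_schema, value_type in spec.items():
            if isinstance(key_schema, type):
                self.typed_keys[key_schema] = value_type
            else:
                self.literal_keys[key_schema] = value_type

    def validate(self, data):
        for key, value in data.items():
            # Pick the single candidate schema for this key instead of
            # trying every schema in turn.
            if key in self.literal_keys:
                expected = self.literal_keys[key]
            elif type(key) in self.typed_keys:
                expected = self.typed_keys[type(key)]
            else:
                raise ValueError(f"unexpected key: {key!r}")
            if not isinstance(value, expected):
                raise ValueError(f"bad value for key {key!r}: {value!r}")
        return data


checker = PrecompiledDict({"id": int, str: str})
checked = checker.validate({"id": 1, "note": "ok"})
```

Here the classification of key schemas happens once in `__init__`, and each call to `validate` does a constant-time lookup per key.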

I tried to be as compatible as possible with the current version so that it can be merged. I think everybody would be glad to have this updated version. However, being a total refactoring, it clearly requires some discussion.

Thank you.

aure-olli avatar May 14 '19 04:05 aure-olli

Thank you for this, it sounds extremely interesting. I would be open to merging some (or all) of the functionality, but it would have to be piecemeal as one big diff is too hard to review :(

skorokithakis avatar May 17 '19 23:05 skorokithakis

Hello. Yeah I agree but this is a refactoring :/ I'll see what I can do about it once I'm satisfied with the results.

aure-olli avatar May 22 '19 02:05 aure-olli

I know, it just touches so many things that it's hard to review together :/

skorokithakis avatar May 22 '19 09:05 skorokithakis

Having OpenAPI generation is very interesting; that would avoid having to write the same thing twice.

cglacet avatar Aug 10 '22 10:08 cglacet