cerberus icon indicating copy to clipboard operation
cerberus copied to clipboard

conditional normalization using oneof

Open linupi opened this issue 5 years ago • 8 comments

Used Cerberus version / latest commit: current master

  • [x] I have the capacity to improve the docs when my problem is solved.
  • [x] I have the capacity to submit a patch when a bug is identified.
  • [x] My question does not concern a practical use-case that I can't figure out to solve.

Use-case abstract

So far Cerberus does not allow for conditional default values which is supported by other validators (e.g. in Schema using Or ). Here pledge for a new feature that adds the possibility to define more complex defaults i.g. where the default value depends on another field. This feature can be added without any new rule simply by considering oneofduring normalization.


Feature request

It would be nice to consider oneof also for normalization. While other of-rules (anyof, noneof, alloff) are not unambiguous in the context of normalization oneof instead could be used to insert conditional default values.

e.g.

    schema = {
        'foo': {'type': 'string'},
        'bar' :{'oneof':[{'dependencies':{"foo":"B"}, 'default':"B"}, {'dependencies':{"foo":"A"}, 'default':"C"}]}
    }
    document = {'foo': 'A'}
    expected_normalization_result = {'foo': 'A','bar':"C"}

linupi avatar Nov 18 '19 12:11 linupi

while you are correct that conditional normalization with the oneof-rule would seem to make sense, i still object this approach because of the inconsistency and additional complexity it introduces.

since you want to set a default value that depends on the document context, i propose to use a default_setter with the extension proposed in #279. i have just started a branch where that is supposed to be implemented.

funkyfuture avatar Nov 19 '19 16:11 funkyfuture

thank you for your fast reply @funkyfuture and sorry that it took me so long to react.

At first I experimented with default_setter as well, but I didn't like it too much because it always sets a default, so it is not possible to insert a default only under certain conditions.

On the validation side I really like the extensibility of Cerberus using _validate_MyNewMethod, however I was lacking the equivalent for the normalization. That is how I started to think about the oneof for normalization, in order to avoid the need of extending Cerberus rules.

Following what is proposed in #279 I agree that the complexity inside the code might be reduced, but I am afraid that it makes the definition of the schema more complex and as far as I can see it might introduces redundancy in the schema definition as well when use for the problem that I tried to address here. However, I'd be happy to have a look once the corresponding branch is online - maybe my first impression is wrong.

Above you mention that the proposed oneof-normalization introduces inconsistency. Just out of curiosity I'd like to understand what aspects you are referring, as I can't see any new inconsistency yet, but rather think it would reduce the number of cases where a provide mapping is validated as valid, but not normalized.

linupi avatar Nov 27 '19 18:11 linupi

don't worry, i'm short of time as well.

but I didn't like it too much because it always sets a default, so it is not possible to insert a default only under certain conditions.

see #438 and amend if you have additional thoughts.

Above you mention that the proposed oneof-normalization introduces inconsistency.

yes, because the oneof-rule was introduced with, is implemented alike and considered by users as part of the *of-rules. the second aspect would change dramatically with your proposal, and the last one will lead to further (unbearable) feature requests involving the other rules.

i'm in a hurry, but let me try to point out what my overall view is on the matter:

Cerberus is great to make simple schemas that solve rather simple problems. of course there are more complex problems in the world, but Cerberus could only solve these with a more complex grammar (it doesn't have a formalized one, btw) which would render the solutions to simple problems to be more complex. in my experience, users often expect some kind of magic and implicit handling for a given combination of rules (and constraints) that are absolutely obvious and unambiuous in the scope of the problem they tackle. but they often fail the problems of other users when abstracted. and the code gets much less maintainable with each of such implications. one very aspect that concerns your proposal: normalization and validation are two different stages (each can be skipped) and they work very differently. bridging these phases is possible, and we do that for two or three rules, but it's quickly getting messy with the amount of rules. in the future i hope to reduce that number and to implement a framework that allows to declare complex rules in a clean way.

having that said, i really want to advise you that you take a complex problem as such. that improvement of the framework will take a while, isn't guaranteed to succeed and i'll be happy to have your feedback on it.

funkyfuture avatar Nov 29 '19 18:11 funkyfuture

To me Cerberus is already quite a strong tool that manages to deal with rather complex problems using a simple grammar, that is one of the reasons why I really like Cerberus.

Looking at my proposal I, currently define a schema like this:

schema = {
    'foo': {'type': 'string'},
    'bar': {
        'oneof': [
            {'dependencies': {"foo": "B"}, 'default': "B"},
            {'dependencies': {"foo": "A"}, 'default': "C"},
        ]
    },
}

how would you like if the normalization role of oneof would be more explicit e.g. by defining a default_setter like this:

schema = {
    'foo': {'type': 'string'},
    'bar': {
        'oneof': [
            {'dependencies': {"foo": "B"}, 'default': "B"},
            {'dependencies': {"foo": "A"}, 'default': "C"},
        ]
    },
    default_setter = 'oneof'           # explicitly defining that oneof is setting the default
}

like this the there is a clear cut between the role of 'oneof' in the validation phase and in the normalization phase. I actually would have liked to go this way, but unfortunately the signature of the function called by _normalize_default_setter does not allow for the transmission of a schema. Would it be imaginable to extend the signature of the setter called in _normalize_default_setter to something like mapping, schema, field instead of just mapping (at least in the case where the setter is provided as extension of the validator and not as an external callable)? In that case I could rewrite my proposal ... if it is considered as potentially sustainable...

linupi avatar Dec 02 '19 16:12 linupi

any update on this issue? Today I just meet the problem as you guys. By the way, I think the proposal above would be ok if implemented.

hulb avatar Oct 20 '20 08:10 hulb

I am not aware of any progress on this, For the moment we are using a patched Validator that does oneof normalization that is somehow customized to our project but of cause it would be great to have this sort of possibility in cerberus itself.

In case you want to have a look, here is what we use: https://gitlab.esrf.fr/bliss/bliss/-/blob/master/bliss/common/validator.py https://gitlab.esrf.fr/bliss/bliss/-/blob/master/tests/common/test_validator.py

linupi avatar Oct 21 '20 12:10 linupi

thank you, it's helpful!

hulb avatar Oct 22 '20 03:10 hulb

I'm not sure but we might be running into the same issue as described here. We have a oneof defined with two possible schemas below it. For one of these schemas we wanted to set a default value for one of its fields, which seems like a pretty reasonable thing to do I think. It didn't work though and it took us a bit to figure out that that was because of the oneof. Would this case be covered by the issue as described here?

simonvanderveldt avatar Jun 09 '21 11:06 simonvanderveldt

i'm closing this now. my main point would still be that it introduces inconsistency in the design and code complexity for rather rare cases.

as a recommendation for all your cases, i wanna remind that Cerberus was initally designed for validation, not normalization, and point at the possibility to do these things in subsequent steps. not all steps need to use Cerberus.

funkyfuture avatar Oct 24 '22 12:10 funkyfuture