Simultaneously return data, plus errors of offending keys

Open jtlz2 opened this issue 7 years ago • 12 comments

Use case: I have about 20 fields to (cross-)validate, and I need the data where validation succeeds and, where it fails, either no data or the relevant error message.

Is there an existing way to combine the functionality of validate() and SchemaError output in such a way that if a SchemaError is thrown because of a particular key, all the other data are returned but a SchemaError is returned for that key?

To borrow from another issue's example, suppose I have:

from schema import And, Schema

s = Schema({'a': And(int,
                     lambda x: x > 10,
                     lambda y: y % 2 == 0),
            'b': int})
x = s.validate({'a': 12, 'b': 'r'})  # as things stand, this raises SchemaError because of 'b'

Is there any way - even a workaround for now - to return to x as:

[{'a':12},se] where se contains se.code and/or se.errors etc.

Even two separate returned arrays would be fine (better?).

I am trying to avoid handling exceptions for every single key (which would obviate the need for Schema I think), but perhaps this is unavoidable. It would be quite painful to run validate and then to go round for a second pass on arbitrarily-failing elements.

Love what you're doing - thanks!

PS: I think this could just be rephrased as: Getting SchemaError to return the validated (which could mean processed) data as well. Otherwise I could just combine the original data structure with the SchemaError output, right?

PPS: Could I just make use of #169? The problem is that I would then need a separate try... except block for every constituent atomic test (plus combinations).

jtlz2 avatar Oct 08 '18 12:10 jtlz2

@jtlz2 I've modified #169 to reflect the ideas in this comment; let me know your thoughts.

It seems like you could write a "reducer" that looks something like this:

s1 = Schema({'a': And(int,
                 lambda x: x > 10,
                 lambda y: y % 2 == 0),
              'b': int})

def return_errors(schemas):
    merged = And(*schemas)
    def validator(value):
        try:
            return merged.validate(value)
        except SchemaError as error:
            return error
    return validator

s2 = compose(s1, reduce=return_errors)
valid = s2.validate({'a': 12, 'b': 'r'}) # returns {"a": 12, "b": error}

The resulting {"a": 12, "b": error} isn't exactly what you specified, but it's close.

This definitely isn't the intended purpose of compose, but you could use it in this way I suppose.

To get exactly what you're looking for, there'd have to be more fundamental changes to the way schemas validate, since what you're asking is for errors to "bubble up" from within the schemas, and that's simply not supported as it exists now.

rmorshea avatar Oct 08 '18 18:10 rmorshea

@rmorshea This is amazing - thank you. I think it accomplishes what I need (and, at least for trying it out, is infinitely preferable to, say, the Cerberus route...).

Presumably there is nothing to stop one returning value (i.e. the raw, unvalidated field value) along with/instead of error (or for that matter anything else one fancies)?

How can I try out the modified code?

Thanks again!

jtlz2 avatar Oct 08 '18 19:10 jtlz2

@jtlz2 you can try it out by copying and pasting this gist.

Edit: I just tried it out, and the example I proposed doesn't quite work. I'm trying to come up with a way to do it.

rmorshea avatar Oct 09 '18 00:10 rmorshea

Here's the updated example:

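from schema import And, Schema, SchemaError, Use
from schema_compose import compose  # compose from the #169 gist, under the module name used later in this thread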

s1 = Schema({'a': And(int,
                 lambda x: x > 10,
                 lambda y: y % 2 == 0),
              'b': int})

def return_errors(schemas):
    merged = Schema(schemas[0])
    def validator(value):
        try:
            return merged.validate(value)
        except SchemaError as error:
            return error
    return Use(validator)

s2 = compose(s1, s1, reduce=return_errors)
valid = s2.validate({'a': 12, 'b': 'r'}) # returns {"a": 12, "b": error}

This definitely exploits the behavior of compose, but it works.

Note the use of compose(s1, s1, ...). This is required because we have to make compose think that it's merging two dictionaries, so that it visits each key-value pair and applies the return_errors reducer to resolve what it thinks are conflicting keys. The schemas[0] in return_errors simply ignores the second s1 input.

I'm not sure we can call this a "solution" to the issue.

This may actually warrant a check in compose to see whether or not two values are == before applying the reducer. This would then break the "hack" of compose I proposed in the example above.

rmorshea avatar Oct 09 '18 00:10 rmorshea

I think the better solution to this is to write a function that recursively visits each key-value pair of dicts and wraps them in something like the return_errors reducer. Will post this function shortly...

rmorshea avatar Oct 09 '18 00:10 rmorshea

@rmorshea Thank you so so much - the updated example you give does precisely what I asked for and works for me. Amazing! So relieved to have a pythonic way to do all this.

I await the better solution with interest.

Hats off too for such a speedy response!

jtlz2 avatar Oct 09 '18 06:10 jtlz2

@jtlz2 here it is:

from schema import Schema, SchemaError, Use, _priority, ITERABLE, DICT


def silent_exceptions(schema):
    value = schema._schema if isinstance(schema, Schema) else schema
    schema_type = _priority(value)
    if schema_type == DICT:
        new = {}
        for k, v in value.items():
            new[k] = silent_exceptions(v)
        return Schema(new)
    elif schema_type == ITERABLE:
        return Schema(list(map(silent_exceptions, value)))
    else:
        schema = Schema(value)
        def validate(x):
            try:
                return schema.validate(x)
            except SchemaError as error:
                return error
        return Use(validate)

from schema import Schema, And

s1 = Schema({'a': And(int,
                 lambda x: x > 10,
                 lambda y: y % 2 == 0),
              'b': int})
s2 = silent_exceptions(s1)
x = s2.validate({'a': 12, 'b': 'r'}) # returns {"a": 12, "b": error}

rmorshea avatar Oct 09 '18 18:10 rmorshea

Ultimately, if something like this were to be merged into schema, it would probably be as a silent=True flag in Schema.validate.

Thoughts @skorokithakis?

rmorshea avatar Oct 09 '18 18:10 rmorshea

I think it's better if we merge your "composer" idea, @rmorshea, since it can be used for different use cases, and then have a section in the docs where we explain how to use it to return errors for keys that don't validate, rather than adding extra functionality for this.

skorokithakis avatar Oct 13 '18 15:10 skorokithakis

@rmorshea I have made great progress - for which many thanks - but have run into problems combining compose with Optional. It may be a problem with how I am using Optional - not sure. Are you able to advise?

from schema import *
from collections import OrderedDict
s=Schema(dict(OrderedDict([(Optional('status',default='default'),str),('k',str)])))
x={'k':'v'}
y={'status':'present','k':'v'}
s.validate(x)
# {'k': 'v', 'status': 'default'}
s.validate(y)
# {'k': 'v', 'status': 'present'}

from schema_compose import compose
s2 = compose(s, s, reduce=return_errors) # return_errors as per your earlier definition
s2.validate(x)
# {'k': 'v'} # What happened to the default status!?
s2.validate(y)
# SchemaWrongKeyError: Wrong keys 'status' in {'status': 'present', 'k': 'v'} # How to handle when status IS present!?

Huge thanks.

jtlz2 avatar Oct 18 '18 10:10 jtlz2

Yup, I’ll update my gist by the end of today.

rmorshea avatar Oct 18 '18 15:10 rmorshea

@rmorshea No worries - thanks so much in advance!

jtlz2 avatar Oct 21 '18 09:10 jtlz2