schema icon indicating copy to clipboard operation
schema copied to clipboard

Wrong priority for Optional param

Open apieceofredcloth opened this issue 12 years ago • 6 comments

>>> schema = Schema({Optional('id'): Use(int), str: str})
>>> schema.validate({'id': '23', 'other': 'asf'})

>>> {'id': '23', 'other': 'asf'}

I think value of id should be converted to integer 23 instead of string '23'("by Use(int)").But "str: str" has a higher priority than "Optional('id')", so it just ignores the "Use(int)" statement. I check the code about priority

def priority(s):
    if type(s) in (list, tuple, set, frozenset):
        return 6
    if type(s) is dict:
        return 5
    if hasattr(s, 'validate'):
        return 4
    if type(s) is type:
        return 3
    if callable(s):
        return 2
    else:
        return 1

Optional has a 'validate' attribute which is inherited from Schema class, so it return 4 here, higher than 3 of the str case.

apieceofredcloth avatar Sep 13 '13 08:09 apieceofredcloth

So you propose switching around the priorities?

I wonder, what impact that would have on other cases...

keleshev avatar Sep 13 '13 13:09 keleshev

I am afraid of that impact will bite too.Currently, I think "hasattr(s, 'validate')"(Optional, And, Or...) is more specifical than "is type"(str, object...)

apieceofredcloth avatar Sep 15 '13 03:09 apieceofredcloth

Same issue here. But in my case, even worse!

Here is what I have:

import schema

s = schema.Schema({
    'a': Use(int),
    'b': Use(int),
    schema.Optional('c'): Use(int),
    schema.Optional(object): object,
})

Here I expect that 'a', 'b' are always required, 'c' is optional, and all other parameters are optional. I mean, if 'c' is present, it should be always validated. But look what happens:

keys = sorted([(schema.priority(k), k) for k, v in s._schema.items()])
import pprint; pprint.pprint(keys)
[(1, 'a'),
 (1, 'b'),
 (4, Optional(<type 'object'>)),
 (4, Optional('c'))]

It means, that 'c' will never be validated. What is even worse, that this creates race condition, since all Optional keys will be validated in random order ('c' gets validated randomly), that's why it took me 4 hours, to debug this issue, when my tests started to break randomly.

sirex avatar Jul 29 '14 15:07 sirex

I also have the same issue. I'm trying to validate some query arguments from a web request:

s = Schema({
        Optional('start'): Use(int),
        basestring: object
    })

s.validate({'start': '1'})
{'start': '1'}   # matched basestring: object

'1' is not converted to int because basestring matches before Optional('start')

As already pointed out by @apieceofredcloth, I would suggest to increase the priority of the hasattr(s, 'validate') case.

Another solution may be to have the classes derived from Schema (such as Optional) expose the same priority of the underlying schema, so Optional('start') would have a priority of 1, while Optional(str) would have a priority of 3.

I can create a pull request if needed.

acaprari avatar Jan 14 '15 15:01 acaprari

This isn't fully fixed by #52. Here is a unit test that (sometimes) reproduces the problem i'm seeing on my machine. I'm not sure why it doesn't consistently hit?

diff --git a/test_schema.py b/test_schema.py
index 4ec9993..9feebf8 100644
--- a/test_schema.py
+++ b/test_schema.py
@@ -154,6 +154,9 @@ def test_dict_optional_keys():
     # Make sure Optionals are favored over types:
     assert Schema({basestring: 1,
                    Optional('b'): 2}).validate({'a': 1, 'b': 2}) == {'a': 1, 'b': 2}
+    assert Schema({str: {
+        Optional('a', default=False): bool, Optional(str): object}}).\
+            validate({'c': {'a': True, 'b': 2}}) == {'c': {'a': True, 'b': 2}}

Test output:

E       assert {'c': {'a': False, 'b': 2}} == {'c': {'a': True, 'b': 2}}
E         Differing items:
E         {'c': {'a': False, 'b': 2}} != {'c': {'a': True, 'b': 2}}
E         Use -v to get the full diff

The reason for this is the ordering. I printed out the ordered keys and their relative priority with this diff:

diff --git a/schema.py b/schema.py
index f36ab1d..fd9df58 100644
--- a/schema.py
+++ b/schema.py
@@ -116,6 +116,8 @@ class Schema(object):
             covered_optionals = set()
             # for each key and value find a schema entry matching them, if any
             sorted_skeys = list(sorted(s, key=priority))
+            print('sorted_skeys', sorted_skeys)
+            print([(y, priority(y)) for y in sorted_skeys])
             for key, value in data.items():
                 valid = False
                 skey = None

Output:

('sorted_skeys', [<type 'str'>])
[(<type 'str'>, 3)]
('sorted_skeys', [Optional(<type 'str'>), Optional('a')])
[(Optional(<type 'str'>), 2), (Optional('a'), 2)]

The correct ordering should be: [(Optional('a'), 2), (Optional(<type 'str'>), 2)]

I whipped together a quick patch, but it feels dirty and I don't think it fixes the general case...

diff --git a/schema.py b/schema.py
index f36ab1d..c9c666e 100644
--- a/schema.py
+++ b/schema.py
@@ -73,7 +73,8 @@ class Use(object):
             raise SchemaError('%s(%r) raised %r' % (f, data, x), self._error)


-COMPARABLE, CALLABLE, VALIDATOR, TYPE, DICT, ITERABLE = range(6)
+COMPARABLE, CALLABLE, VALIDATOR_VAL, VALIDATOR_TYPE, TYPE, DICT, ITERABLE = \
+    range(7)


 def priority(s):
@@ -85,7 +86,9 @@ def priority(s):
     if issubclass(type(s), type):
         return TYPE
     if hasattr(s, 'validate'):
-        return VALIDATOR
+        if hasattr(s, '_schema') and issubclass(type(s._schema), type):
+            return VALIDATOR_TYPE
+        return VALIDATOR_VAL
     if callable(s):
         return CALLABLE
     else:
@@ -163,7 +166,7 @@ class Schema(object):
                 return data
             else:
                 raise SchemaError('%r should be instance of %r' % (data, s), e)
-        if flavor == VALIDATOR:
+        if flavor in (VALIDATOR_TYPE, VALIDATOR_VAL):
             try:
                 return s.validate(data)
             except SchemaError as x:

marc2982 avatar Aug 14 '15 22:08 marc2982

These days, I have encountered the same problem as below

def test_optional_key_convert_failed_randomly_while_with_another_optional_object():
    """
    In this test, created_at string "2015-10-10 00:00:00" is expected to be converted to a datetime instance.
        - it works when the schema is

            s = Schema({
                    'created_at': _datetime_validator,
                    Optional(basestring): object,
                })

        - but when wrapping the key 'created_at' with Optional, it fails randomly
    :return:
    """
    import datetime
    datetime_fmt = '%Y-%m-%d %H:%M:%S'
    _use_datetime = lambda i: datetime.datetime.strptime(i, datetime_fmt)
    _datetime_validator = Or(None, Use(_use_datetime))
    s = Schema({
        Optional('created_at'): _datetime_validator,
        Optional(basestring): object,
    })
    data = {
        'created_at': '2015-10-10 00:00:00'
    }
    validated_data = s.validate(data)
    # is expected to be converted to a datetime instance, but fails randomly(most of the time)
    # assert isinstance(validated_data['created_at'], datetime.datetime)

    assert isinstance(validated_data['created_at'], basestring)

dylan4224 avatar Oct 10 '15 15:10 dylan4224