schema
schema copied to clipboard
Wrong priority for Optional param
>>> schema = Schema({Optional('id'): Use(int), str: str})
>>> schema.validate({'id': '23', 'other': 'asf'})
>>> {'id': '23', 'other': 'asf'}
I think value of id should be converted to integer 23 instead of string '23'("by Use(int)").But "str: str" has a higher priority than "Optional('id')", so it just ignores the "Use(int)" statement. I check the code about priority
def priority(s):
if type(s) in (list, tuple, set, frozenset):
return 6
if type(s) is dict:
return 5
if hasattr(s, 'validate'):
return 4
if type(s) is type:
return 3
if callable(s):
return 2
else:
return 1
Optional has a 'validate' attribute which is inherited from Schema class, so it return 4 here, higher than 3 of the str case.
So you propose switching around the priorities?
I wonder, what impact that would have on other cases...
I am afraid of that impact will bite too.Currently, I think "hasattr(s, 'validate')"(Optional, And, Or...) is more specifical than "is type"(str, object...)
Same issue here. But in my case, even worse!
Here is what I have:
import schema
s = schema.Schema({
'a': Use(int),
'b': Use(int),
schema.Optional('c'): Use(int),
schema.Optional(object): object,
})
Here I expect that 'a', 'b' are always required, 'c' is optional, and all other parameters are optional. I mean, if 'c' is present, it should be always validated. But look what happens:
keys = sorted([(schema.priority(k), k) for k, v in s._schema.items()])
import pprint; pprint.pprint(keys)
[(1, 'a'),
(1, 'b'),
(4, Optional(<type 'object'>)),
(4, Optional('c'))]
It means, that 'c' will never be validated. What is even worse, that this creates race condition, since all Optional keys will be validated in random order ('c' gets validated randomly), that's why it took me 4 hours, to debug this issue, when my tests started to break randomly.
I also have the same issue. I'm trying to validate some query arguments from a web request:
s = Schema({
Optional('start'): Use(int),
basestring: object
})
s.validate({'start': '1'})
{'start': '1'} # matched basestring: object
'1' is not converted to int because basestring matches before Optional('start')
As already pointed out by @apieceofredcloth, I would suggest to increase the priority of the hasattr(s, 'validate') case.
Another solution may be to have the classes derived from Schema (such as Optional) expose the same priority of the underlying schema, so Optional('start') would have a priority of 1, while Optional(str) would have a priority of 3.
I can create a pull request if needed.
This isn't fully fixed by #52. Here is a unit test that (sometimes) reproduces the problem i'm seeing on my machine. I'm not sure why it doesn't consistently hit?
diff --git a/test_schema.py b/test_schema.py
index 4ec9993..9feebf8 100644
--- a/test_schema.py
+++ b/test_schema.py
@@ -154,6 +154,9 @@ def test_dict_optional_keys():
# Make sure Optionals are favored over types:
assert Schema({basestring: 1,
Optional('b'): 2}).validate({'a': 1, 'b': 2}) == {'a': 1, 'b': 2}
+ assert Schema({str: {
+ Optional('a', default=False): bool, Optional(str): object}}).\
+ validate({'c': {'a': True, 'b': 2}}) == {'c': {'a': True, 'b': 2}}
Test output:
E assert {'c': {'a': False, 'b': 2}} == {'c': {'a': True, 'b': 2}}
E Differing items:
E {'c': {'a': False, 'b': 2}} != {'c': {'a': True, 'b': 2}}
E Use -v to get the full diff
The reason for this is the ordering. I printed out the ordered keys and their relative priority with this diff:
diff --git a/schema.py b/schema.py
index f36ab1d..fd9df58 100644
--- a/schema.py
+++ b/schema.py
@@ -116,6 +116,8 @@ class Schema(object):
covered_optionals = set()
# for each key and value find a schema entry matching them, if any
sorted_skeys = list(sorted(s, key=priority))
+ print('sorted_skeys', sorted_skeys)
+ print([(y, priority(y)) for y in sorted_skeys])
for key, value in data.items():
valid = False
skey = None
Output:
('sorted_skeys', [<type 'str'>])
[(<type 'str'>, 3)]
('sorted_skeys', [Optional(<type 'str'>), Optional('a')])
[(Optional(<type 'str'>), 2), (Optional('a'), 2)]
The correct ordering should be: [(Optional('a'), 2), (Optional(<type 'str'>), 2)]
I whipped together a quick patch, but it feels dirty and I don't think it fixes the general case...
diff --git a/schema.py b/schema.py
index f36ab1d..c9c666e 100644
--- a/schema.py
+++ b/schema.py
@@ -73,7 +73,8 @@ class Use(object):
raise SchemaError('%s(%r) raised %r' % (f, data, x), self._error)
-COMPARABLE, CALLABLE, VALIDATOR, TYPE, DICT, ITERABLE = range(6)
+COMPARABLE, CALLABLE, VALIDATOR_VAL, VALIDATOR_TYPE, TYPE, DICT, ITERABLE = \
+ range(7)
def priority(s):
@@ -85,7 +86,9 @@ def priority(s):
if issubclass(type(s), type):
return TYPE
if hasattr(s, 'validate'):
- return VALIDATOR
+ if hasattr(s, '_schema') and issubclass(type(s._schema), type):
+ return VALIDATOR_TYPE
+ return VALIDATOR_VAL
if callable(s):
return CALLABLE
else:
@@ -163,7 +166,7 @@ class Schema(object):
return data
else:
raise SchemaError('%r should be instance of %r' % (data, s), e)
- if flavor == VALIDATOR:
+ if flavor in (VALIDATOR_TYPE, VALIDATOR_VAL):
try:
return s.validate(data)
except SchemaError as x:
These days, I have encountered the same problem as below
def test_optional_key_convert_failed_randomly_while_with_another_optional_object():
"""
In this test, created_at string "2015-10-10 00:00:00" is expected to be converted to a datetime instance.
- it works when the schema is
s = Schema({
'created_at': _datetime_validator,
Optional(basestring): object,
})
- but when wrapping the key 'created_at' with Optional, it fails randomly
:return:
"""
import datetime
datetime_fmt = '%Y-%m-%d %H:%M:%S'
_use_datetime = lambda i: datetime.datetime.strptime(i, datetime_fmt)
_datetime_validator = Or(None, Use(_use_datetime))
s = Schema({
Optional('created_at'): _datetime_validator,
Optional(basestring): object,
})
data = {
'created_at': '2015-10-10 00:00:00'
}
validated_data = s.validate(data)
# is expected to be converted to a datetime instance, but fails randomly(most of the time)
# assert isinstance(validated_data['created_at'], datetime.datetime)
assert isinstance(validated_data['created_at'], basestring)