Validating regex patterns in schema

Open maingoh opened this issue 3 years ago • 1 comments

I'm trying to validate a user-supplied JSON schema using DraftXValidator.check_schema, but it appears that check_schema doesn't verify that the regex in pattern is valid. The schema is accepted, and I get a crash later when validating some data against it:

>>> import jsonschema
>>> schema = {'type':'string', 'pattern': '*hello'}
>>> jsonschema.validate('foo', schema=schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/jsonschema/validators.py", line 1020, in validate
    error = exceptions.best_match(validator.iter_errors(instance))
  File "/usr/local/lib/python3.8/site-packages/jsonschema/exceptions.py", line 356, in best_match
    best = next(errors, None)
  File "/usr/local/lib/python3.8/site-packages/jsonschema/validators.py", line 229, in iter_errors
    for error in errors:
  File "/usr/local/lib/python3.8/site-packages/jsonschema/_validators.py", line 230, in pattern
    and not re.search(patrn, instance)
  File "/usr/local/lib/python3.8/re.py", line 201, in search
    return _compile(pattern, flags).search(string)
  File "/usr/local/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/local/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/local/lib/python3.8/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/local/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/local/lib/python3.8/sre_parse.py", line 668, in _parse
    raise source.error("nothing to repeat",
re.error: nothing to repeat at position 0

This error could be caught up front if check_schema ran re.compile on the pattern. As it stands, the following call accepts the invalid pattern without complaint:

Draft202012Validator.check_schema({'type':'string', 'pattern': '*hello'})
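
As a rough illustration of the re.compile idea above, here is a minimal standard-library sketch of a pre-check that could be run before handing a schema to jsonschema. The helper name find_invalid_patterns and the "$"-style paths are hypothetical and not part of jsonschema. It assumes every string value stored under a "pattern" key is meant to be a regex, so it can misfire on data embedded in keywords such as enum, const, or default; it ignores patternProperties keys; and it checks against Python's re dialect rather than the ECMA 262 dialect the JSON Schema specification refers to.

```python
import re


def find_invalid_patterns(schema, path="$"):
    """Collect (path, pattern, message) for each "pattern" string that re.compile rejects.

    Hypothetical helper for illustration only; it is not a jsonschema API.
    """
    problems = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            child_path = f"{path}.{key}"
            if key == "pattern" and isinstance(value, str):
                try:
                    re.compile(value)
                except re.error as exc:
                    problems.append((child_path, value, str(exc)))
            else:
                # Recurse into subschemas (properties, items, anyOf, ...).
                problems.extend(find_invalid_patterns(value, child_path))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            problems.extend(find_invalid_patterns(item, f"{path}[{index}]"))
    return problems


schema = {"type": "string", "pattern": "*hello"}
problems = find_invalid_patterns(schema)
for location, pattern, message in problems:
    print(f"{location}: invalid regex {pattern!r}: {message}")
```

An empty result means no pattern failed to compile under these assumptions; it does not guarantee the schema is otherwise valid, so check_schema is still needed alongside it.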

maingoh, Jan 13 '22 09:01

This is because check_schema treats format in the metaschema as an annotation rather than an assertion, so the regex format declared for pattern is not enforced. Unfortunately, that's technically not incorrect behavior, even if it's a footgun. I'll leave this open and consider what to do about it.

Julian, Jul 31 '22 08:07