Getting UnboundLocalError

Open jinsooihm opened this issue 2 years ago • 3 comments

Hello,

I am getting a weird error when trying to run validate.

import fastjsonschema

data = ["str"]
schema = {"minItems":1, "minLength": 1}
fastjsonschema.validate(schema, data)

gives

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../fastjsonschema/__init__.py", line 114, in validate
    return compile(definition, handlers, formats, use_default)(data)
  File "<string>", line 10, in validate
UnboundLocalError: local variable 'data_len' referenced before assignment

It seems like something goes wrong when using minItems/maxItems with minLength/maxLength?

jinsooihm · Oct 17 '22 15:10

@jinsoo960 Arrays can be validated with minItems/maxItems, while strings can be validated with minLength/maxLength, as per the JSON Schema specification. Each keyword carries an implicit type check (list/tuple for minItems/maxItems, str for minLength/maxLength) alongside the length validation. Using minItems together with minLength is arguably an illegal operation, since they validate different data types.

https://json-schema.org/draft/2020-12/json-schema-validation.html#name-minlength
https://json-schema.org/draft/2020-12/json-schema-validation.html#name-maxlength
https://json-schema.org/draft/2020-12/json-schema-validation.html#name-maxitems
https://json-schema.org/draft/2020-12/json-schema-validation.html#name-minitems

melroy-tellis · Dec 10 '22 15:12

Both minItems/maxItems and minLength/maxLength use the same generator function, create_variable_with_length. This function defines a variable {variable}_len to hold the length of the passed variable. In the generated code, the variable is assigned only if the object is of the appropriate type (str for minLength/maxLength, list/tuple for minItems/maxItems). However, the generator always adds it to its internal set of tracked variables (self._variables), whether or not that assignment will actually run.

For instance, if the schema is:

{
  "minItems": 1,
  "minLength": 1
}

Since both minLength and minItems are defined, the generator first emits an if block for minItems in which {variable}_len is assigned, and adds {variable}_len to its internal set of tracked variables. When it then generates the if block for minLength, it does not assign {variable}_len again, because the name is already in that set.
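To make the tracking behaviour concrete, here is a heavily simplified toy generator. It is a sketch and not the library's code: the class ToyGenerator, the emit helper, the method signatures and the ValueError messages are all assumptions made for illustration. Only the names create_variable_with_length and _variables come from the description above.

```python
class ToyGenerator:
    """Hypothetical stand-in for the code generator (illustration only)."""

    def __init__(self):
        self._variables = set()   # names the generator believes are defined
        self.lines = []           # generated source, one line per entry

    def emit(self, line, indent=0):
        self.lines.append("    " * indent + line)

    def create_variable_with_length(self, variable, indent):
        name = f"{variable}_len"
        if name in self._variables:
            return                # already tracked, so no assignment is emitted
        self._variables.add(name)
        self.emit(f"{name} = len({variable})", indent)

    def generate_min_items(self, variable, limit):
        self.emit(f"if isinstance({variable}, (list, tuple)):")
        self.create_variable_with_length(variable, indent=1)
        self.emit(f"if {variable}_len < {limit}:", 1)
        self.emit("raise ValueError('too few items')", 2)

    def generate_min_length(self, variable, limit):
        self.emit(f"if isinstance({variable}, str):")
        self.create_variable_with_length(variable, indent=1)
        self.emit(f"if {variable}_len < {limit}:", 1)
        self.emit("raise ValueError('too short')", 2)


gen = ToyGenerator()
gen.generate_min_items("data", 1)
gen.generate_min_length("data", 1)
source = "\n".join(gen.lines)
print(source)
```

The printed source contains a single `data_len = len(data)` line, inside the first isinstance branch only; the second branch reads data_len without assigning it.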

The code block generated would be roughly of the form:

{variable}_is_list = isinstance({variable}, (list, tuple))
if {variable}_is_list:
    {variable}_len = len({variable})
    if {variable}_len < {minItems}:
        raise JSONSchemaException(msg)
if isinstance({variable}, str):
    if {variable}_len < {minLength}:
        raise JSONSchemaException(msg)

Note that {variable}_len is never assigned in the second if block. If the object being validated is a string rather than a list, {variable}_is_list is False, the body of the first if block is skipped, and {variable}_len is still unbound when the second block reads it, which raises the UnboundLocalError.
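The failure can be reproduced without the library by writing the template out by hand. The function below is only a sketch of that shape, with {variable} replaced by data and both limits set to 1; the name validate_sketch and the use of ValueError in place of the library's exception are assumptions.

```python
def validate_sketch(data):
    # Hand-written stand-in for the generated validator (not library output).
    data_is_list = isinstance(data, (list, tuple))
    if data_is_list:
        data_len = len(data)          # assigned only on this branch
        if data_len < 1:
            raise ValueError("data must contain at least 1 item")
    if isinstance(data, str):
        # data_len is read here but was never assigned for a str input
        if data_len < 1:
            raise ValueError("data must be at least 1 character long")
    return data


validate_sketch(["str"])              # a list takes the first branch and passes

try:
    validate_sketch("str")            # a str skips the first branch
except UnboundLocalError as exc:
    print(type(exc).__name__)         # prints "UnboundLocalError"
```

Whichever block is emitted second is the one that lacks the assignment, so the input type that fails depends on the emission order. The original report failed on a list input, which fits the same pattern with the two blocks in the opposite order; the sketch does not try to establish which order the library actually emits.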

My proposed fix is to add a separate generator function, create_variable_with_items, which creates a variable {variable}_items to hold the length of the variable, and to call it from generate_min_items/generate_max_items instead. This resolves the variable name conflict described above. With the fix, minItems is effectively ignored if the passed object is not a list or tuple, and minLength is effectively ignored if the passed object is not a str.
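As a rough picture of what the generated code could look like after such a change, here is a hand-written sketch with {variable} replaced by data and both limits set to 1. It is not taken from the PR: the function name validate_fixed_sketch and the ValueError exceptions are assumptions, and only the variable naming ({variable}_items versus {variable}_len) follows the proposal.

```python
def validate_fixed_sketch(data):
    # Each keyword family owns its own length variable, so neither branch
    # depends on an assignment made in the other.
    if isinstance(data, (list, tuple)):
        data_items = len(data)        # used by minItems/maxItems only
        if data_items < 1:
            raise ValueError("data must contain at least 1 item")
    if isinstance(data, str):
        data_len = len(data)          # used by minLength/maxLength only
        if data_len < 1:
            raise ValueError("data must be at least 1 character long")
    return data


validate_fixed_sketch(["str"])        # minItems applies, minLength is ignored
validate_fixed_sketch("str")          # minLength applies, minItems is ignored
validate_fixed_sketch(42)             # neither keyword applies to a number
```

Empty inputs are still rejected by the keyword that matches their type: an empty list fails the minItems check and an empty string fails the minLength check.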

This is consistent with the behaviour of the online JSON Schema validator Hyperjump.

melroy-tellis · Dec 10 '22 17:12

@horejsek - I have raised a PR for this.

melroy-tellis · Dec 10 '22 17:12