JSON parsing from bytes fails in `allow_partial` mode when the final bytes do not form a complete Unicode code point.

Open dmontagu opened this issue 8 months ago • 0 comments

I think that when partial JSON is allowed and the input is bytes, we shouldn't get an error if the trailing bytes are not valid Unicode, such as a truncated multi-byte UTF-8 character.

Example demonstrating misbehavior:

from pydantic_core import SchemaValidator, core_schema

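# '€' encodes to three UTF-8 bytes (b'\xe2\x82\xac'); dropping the last byte leaves an incomplete sequence.
my_partial_string = '"abc€'  # works fine if you replace this with '"abc€d'
non_unicode_partial_string_bytes = my_partial_string.encode()[:-1]
validator = SchemaValidator(core_schema.any_schema())
validator.validate_json(non_unicode_partial_string_bytes, allow_partial='trailing-strings')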
"""
pydantic_core._pydantic_core.ValidationError: 1 validation error for any
  Invalid JSON: invalid unicode code point at line 1 column 6 [type=json_invalid, input_value=b'"abc\xe2\x82', input_type=bytes]
    For further information visit https://errors.pydantic.dev/2.10/v/json_invalid
"""
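
Until this is handled inside the parser, one possible caller-side workaround is to drop a truncated multi-byte sequence from the end of the buffer before validating. The sketch below uses only the standard library; `trim_incomplete_utf8_tail` is a hypothetical helper name, not part of pydantic-core, and it assumes the input is otherwise valid UTF-8.

```python
import codecs


def trim_incomplete_utf8_tail(data: bytes) -> bytes:
    """Drop a truncated multi-byte UTF-8 sequence from the end of data.

    Hypothetical helper for illustration; not part of pydantic-core.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    # With final=False the decoder buffers an incomplete trailing sequence
    # instead of raising, so only the complete characters are returned.
    text = decoder.decode(data, final=False)
    return text.encode("utf-8")


truncated = '"abc€'.encode()[:-1]  # b'"abc\xe2\x82', same bytes as in the report
trimmed = trim_incomplete_utf8_tail(truncated)
print(trimmed)  # b'"abc'
```

The trimmed bytes could then be passed to `validate_json(..., allow_partial='trailing-strings')` in place of the raw buffer. Bytes that are invalid rather than merely incomplete still raise `UnicodeDecodeError`, so genuine encoding errors are not hidden.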

dmontagu, Feb 12 '25 22:02