`pydantic.Json` should enforce that dict keys may only be of type `str`
In a JSON object, only strings are valid keys. This could be enforced through the `pydantic.Json` type, which currently has some odd casting and error behaviour for various invalid key types.

For integers, which are valid Python dict keys but not valid JSON keys, it would be nice to make this a definition-time error; as it stands, the runtime behaviour is really weird:
```python
from typing import Dict, List
from pydantic import BaseModel, Json

class IntKeys(BaseModel):
    x: Json[Dict[int, int]]  # This type is not valid JSON
```

```python
>>> IntKeys(x='{1: [2]}')  # Note: this string is invalid JSON
ValidationError: 1 validation error for IntKeys
x
  Invalid JSON (type=value_error.json)

>>> IntKeys(x='{"1": [2]}')  # Note: valid JSON, but wrong key type - pydantic casts it to int
IntKeys(x={1: [2]})
```
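For comparison (my addition, stdlib only), the `json` module shows where the asymmetry comes from: serializing silently coerces int keys to strings, while parsing always leaves object keys as `str` - so the cast back to `int` above is pydantic's coercion layer, not the JSON parser's.

```python
import json

# Serializing: int keys are silently coerced to JSON strings
assert json.dumps({1: [2]}) == '{"1": [2]}'

# Parsing: object keys always come back as str; any cast to int
# afterwards is pydantic's doing, not the JSON parser's
assert json.loads('{"1": [2]}') == {"1": [2]}
```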
Lists are not valid Python dict keys (they are unhashable), and pydantic doesn't seem to try casting them either. This makes much more sense, but a definition-time error would still be nice.
```python
class ListKeys(BaseModel):
    x: Json[Dict[List[int], int]]  # And this isn't even valid Python
```

```python
>>> ListKeys(x='{[]: 2}')
ValidationError: 1 validation error for ListKeys
x
  Invalid JSON (type=value_error.json)

>>> ListKeys(x='{"[]": 2}')
ValidationError: 1 validation error for ListKeys
x -> __key__
  value is not a valid list (type=type_error.list)
```
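As a side note (my addition, no pydantic involved): the "isn't even valid Python" part is visible in plain Python, since lists are mutable and therefore unhashable, so they can never be dict keys at runtime.

```python
# A dict literal with a list key fails before any JSON is involved:
# lists are mutable and therefore unhashable
try:
    {[]: 2}
except TypeError as exc:
    print(exc)  # unhashable type: 'list'
```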
Like #2095, I found these while writing tests for #2017.
I'm not sure `x: Json[Dict[int, int]]` should be invalid, actually.

You may agree or disagree with pydantic's enthusiasm for coercion over validation (e.g. if you have an `int` field, `"1"` will automatically be coerced to an int), but that's how pydantic is. With that convention, I think most people would expect `x: Json[Dict[int, int]]` to work too, e.g. the JSON `'{"123": "321"}'` should be coerced to `{123: 321}`.
There are numerous discussions about this (filter issues by the "strictness" label) and I'm very open to changing the behaviour in future. But while pydantic is the way it is, I don't think we should change this. Even if we change it for some types, I suspect most people would still want strings coerced to ints - think about text-only situations like environment variables and URL parameters.
With `Dict[List[int], int]` I agree this is invalid, and a class-creation-time check would be good, but that sounds like a separate feature and quite complex to implement.
I agree with @samuelcolvin that this should not raise an error at definition time.

It does, however, leave open the issue of "surprising" errors while exporting to JSON. I just tripped up on this with regard to UUID.

I'd like to request that something more is done to facilitate exporting dictionaries to JSON such that they can be read back by pydantic. Pydantic already has mechanisms in place to coerce exotic data types to/from strings; it's most surprising to discover that this doesn't happen with dictionary keys.
An example:

```python
from datetime import datetime
from typing import Dict, List
from uuid import UUID
from pydantic import BaseModel

class Wigit(BaseModel):
    id: UUID
    name: str

class WigitRecord(BaseModel):
    timestamp: datetime
    values: Dict[str, float]

class WigitRecordSet(BaseModel):
    origin_timestamp: datetime
    records: Dict[UUID, List[WigitRecord]]
```
It's very surprising to discover that you can `Wigit.parse_raw(item.json())` and `WigitRecord.parse_raw(item.json())`, but when the same components are put together a little differently in `WigitRecordSet.parse_raw(item.json())`, it trips up on an error:

```
TypeError: keys must be str, int, float, bool or None, not UUID
```
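To help localize the failure (my understanding, not a confirmed code path): this `TypeError` comes from the stdlib encoder itself - pydantic v1's `.json()` ultimately calls `json.dumps`, whose `default` hook is only consulted for unserializable *values*, never for keys - so plain `json.dumps` reproduces it without pydantic.

```python
import json
from uuid import UUID

try:
    json.dumps({UUID("00000000-0000-0000-0000-000000000000"): 1})
except TypeError as exc:
    # Same error the WigitRecordSet example hits
    print(exc)  # keys must be str, int, float, bool or None, not UUID
```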
Are there already any workarounds for getting `.json()` to work with non-string keys? Is it possible to fix this in pydantic so that it just works?
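One workaround I'm aware of (a stdlib-only sketch; `stringify_keys` is a hypothetical helper, not pydantic API) is to stringify dict keys recursively before serializing, e.g. applied to the output of `.dict()`. Note it only fixes keys - non-JSON values such as `datetime` would still need pydantic's own encoders.

```python
import json
from uuid import UUID

def stringify_keys(obj):
    """Recursively convert dict keys to str so json.dumps accepts them."""
    if isinstance(obj, dict):
        return {str(k): stringify_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [stringify_keys(v) for v in obj]
    return obj

data = {UUID("00000000-0000-0000-0000-000000000000"): [{"n": 1.5}]}
assert json.dumps(stringify_keys(data)) == (
    '{"00000000-0000-0000-0000-000000000000": [{"n": 1.5}]}'
)
```

Pydantic can then parse the result back, since it coerces the string keys to `UUID` on the way in.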
I would also be very interested in a solution/workaround. I get the same error when calling `.json()` on a model which has a field of type `dict[UUID, Any]`.
I acknowledge the problem, we'll need to find a fix in V2.
I believe this is fixed in v2:

```python
from typing import Any
from uuid import UUID
from pydantic import BaseModel

class Model(BaseModel):
    x: dict[UUID, Any]

m = Model(x={'00000000-0000-0000-0000-000000000000': 1})
print(m)
#> x={UUID('00000000-0000-0000-0000-000000000000'): 1}

assert m.model_dump_json() == '{"x":{"00000000-0000-0000-0000-000000000000":1}}'
```
If there are any similar issues, please report them — I think we now have the infrastructure necessary to resolve JSON-specific serialization behaviour in a more reliable way, so we should be able to address similar issues more easily.