pydantic
pydantic copied to clipboard
order on `extra` fields not respected
Feature Request
Among many things we're using Pydantic for (thank you for this amazing project) is as an interim step for a report / data export engine. On some exports, I know the form of half of the columns & want to validate those. The other half, I don't need to validate.
I use the extra
flag on that so the object is flexible enough w/o having to create a new model dynamically. What I would expect: Pydantic would take & create new fields per extra key on the class in the order that the field is received. What happens: new fields are created, but order isn't respected - see the example below.
Feels like a bug; is this the expected behavior? Given the importance that Pydantic places on field order I'd assume the answer is no but wanted to get your thoughts.
Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())"
:
pydantic version: 1.4
pydantic compiled: False
install path: /Users/leo/.local/share/virtualenvs/services-PFQGUeYK/lib/python3.8/site-packages/pydantic
python version: 3.8.1 (default, Dec 27 2019, 18:06:00) [Clang 11.0.0 (clang-1100.0.33.16)]
platform: macOS-10.15.2-x86_64-i386-64bit
optional deps. installed: ['email-validator']
from pydantic import BaseModel
class CSVRow(BaseModel):
id: int
name: str
address: str
class Config:
extra = "allow"
row: dict = {"id": 1, "name": "Jon Snow", "address": "100 Winterfell", "Sword Name": "Longclaw", "Birth Year": "281 AC", "Birth Place": "Tower of Joy", "Lovers": ["Ygritte", "Danerys Targaryen"]}
validated_row: CSVRow = CSVRow(**row)
validated_row
# CSVRow(id=1, name='Jon Snow', address='100 Winterfell', Birth Year='281 AC', Birth Place='Tower of Joy', Lovers=['Ygritte', 'Danerys Targaryen'], Sword Name='Longclaw')
validated_row.dict()
# {'id': 1, 'name': 'Jon Snow', 'address': '100 Winterfell', 'Birth Year': '281 AC', 'Birth Place': 'Tower of Joy', 'Lovers': ['Ygritte', 'Danerys Targaryen'], 'Sword Name': 'Longclaw'}
validated_row.json()
# '{"id": 1, "name": "Jon Snow", "address": "100 Winterfell", "Birth Year": "281 AC", "Birth Place": "Tower of Joy", "Lovers": ["Ygritte", "Danerys Targaryen"], "Sword Name": "Longclaw"}'
I've changed the label from bug to feature request since we don't make any guarantee about this working the other way at present.
The reason for this is:
https://github.com/samuelcolvin/pydantic/blob/645e5fe6a0c74c95c24410571e9bb804af3eb677/pydantic/main.py#L857-L869
This is the most performant way (I know of) to achieve what's currently required, however I would consider a change to maintain order.
But it would depend on the performance impact, if it's too large I'd rather avoid the change.
Gotcha - I changed the header on the original issue to reflect your label change.
Would you want ordering on extra fields to be an option on config & split the difference? Or is the additional complexity not worth that consideration to you?
Given the relatively substantial performance implications of using a list vs. sets as currently implemented, by performance impact I'd imagine it would be hard to justify. unless you introduce an ordered implementation of set, a la something like https://github.com/LuminosoInsight/ordered-set, which sort of splits the difference but slows things down a little bit).
IIUC the feature request is about key ordering at a model level. But maybe an intermediate option with the extra fields ordered after the declared ones could be implemented with no performance impact. This wouldn't resolve the feature request but it seems a valuable feature by itself and maybe a starting point.
names_used
is only employed to evaluate extra
. An oversimplified and far-from-reality example that would set extra
before the fields iteration and modify it in place suggests that this strategy could be more performant (and less memory-consuming) while avoiding the unordering introduced by the set:
from string import ascii_letters as L
DECLARED, EXTRA = 5, 5
input_data = {L[k]: k for k in range(DECLARED + EXTRA)}
fields = [L[k] for k in range(DECLARED)]
def test1():
# unordered extra
names_used = set()
for f in fields:
names_used.add(f)
extra = input_data.keys() - names_used
def test2():
# ordered extra
extra = list(input_data)
for f in fields:
extra.remove(f)
if __name__ == '__main__':
import timeit
print(timeit.timeit("test1()", setup="from __main__ import test1"))
print(timeit.timeit("test2()", setup="from __main__ import test2"))
+1 on this feature request. I've run into this field order problem when creating tests for a FastAPI application. Using .dict()
seems to keep field order, but .json()
fails regularly.
from pydantic import BaseModel
class TestInput(BaseModel):
class config:
extra = "allow"
class TestOutput(TestInput):
id: str
metadata: str
keys_input = {
"key1": True,
"key2": False
}
keys_output = {
"id": "some_id",
"key1": True,
"key2": False,
"metadata": "some_metadata"
}
test_input = TestInput.parse_obj(keys_input)
test_output = TestOutput.parse_obj(keys_output)
exclude_fields = {"id", "metadata"}
assert test_input.dict() == test_output.dict(exclude=exclude_fields)
assert test_input.json() == test_output.json(exclude=exclude_fields)
Any updates on this? Did you guys found a workaround to repect the field order? I've also run into this problem building a FastAPI application
@xoelop I've used the following as a workaround. Note that I am still using Pydantic v1.8 and haven't tested on more recent versions.
class Foo(BaseModel):
def __init__(self, *args, **kwargs):
# HACK: Because Pydantic does not preserve order for extra parameters (https://github.com/samuelcolvin/pydantic/issues/1234)
# we assign them in order after the class has been created
# HACK #2: To do this, we temporarily set 'allow_mutation' to True, and
# reset it to the value in Config afterwards.
extra_kwargs = {}
for k in list(kwargs):
if k not in self.__fields__:
extra_kwargs[k] = kwargs.pop(k)
super().__init__(*args, **kwargs)
# vvvv Enable mutations vvvv
old_allow_mutation = self.__config__.allow_mutation
self.__config__.allow_mutation = True
for k, v in extra_kwargs.items():
setattr(self, k, v)
self.__config__.allow_mutation = old_allow_mutation
# ^^^^ Reset mutability as before ^^^^
This is fixed in v2 I think. I'll add a unit test to confirm that order is maintained.
I doubt it will get changed in v1, but hopefully v2 will be released this year.