BaseModel.__iter__ may be a footgun
Initial Checks
- [X] I have searched Google & GitHub for similar requests and couldn't find anything
- [X] I have read and followed the docs and still think this feature is missing
Description
See #6977
BaseModel.__iter__ exists (according to its doc) to allow dict(model). This is a handy trick but it does expose users to an accidental footgun if they're not expecting BaseModel types to behave in this way.
Perhaps we should consider deprecating / removing in V3, with the deprecation message maybe advising users to use model.__dict__?
Affected Components
- [ ] Compatibility between releases
- [ ] Data validation/parsing
- [ ] Data serialization -
.model_dump()and.model_dump_json() - [ ] JSON Schema
- [ ] Dataclasses
- [ ] Model Config
- [ ] Field Types - adding or changing a particular data type
- [ ] Function validation decorator
- [ ] Generic Models
- [X] Other Model behaviour -
model_construct(), pickling, private attributes, ORM mode - [ ] Plugins and integration with other tools - mypy, FastAPI, python-devtools, Hypothesis, VS Code, PyCharm, etc.
Selected Assignee: @hramezani
@davidhewitt Did we think about this?
No, I think this needs a V3 label of some kind, and we should leave it here to discuss in V3 sometime in the future.
Code that implements the Mapping interface breaks from v1→v2 because:
- Pydantic warns if you don't inherit first from BaseModel: don't do
MySubClass(MyMapping, BaseModel) - Pydantic implements
__iter__to return items, which is inconsistent and breaks Mapping. Mapping expects to iterate over keys and then look up the keys with__getitem__, but it receives item tuples instead, leading to KeyError.
Note also that the current "syntactic sugar" gives the nice feeling like iterating over a dict, but surprisingly that should give actually: list(iter(dict(key="value"))) == ["key"].
Possibly it is better to have dedicated named iterator methods instead of the current __iter__ implementation at the top-level of the object.
By being consistent with Python's abstract base classes, Pydantic would not only prevent errors, but could still offer the same convenience: dict(**model)
Hi, so I have actually today shot myself in the foot (didn't hurt much) with this unexpected behavior of the BaseModel iter implementation.
I have a class like this
class FrequencyRange(BaseModel):
frequency_low_hz: float = Field(default=0, gt=0)
frequency_high_hz: float = Field(default=0, gt=0)
And it's useful for me that it's a Pydantic model because I want to have value restrictions, value checks etc. But sometimes I want it to behave as a simple tuple like (0, 100), to pass it to 3rd party apps that sometimes want tuples. But instead when I loop over it and print the values I get
('frequency_low_hz', 0.5)
('frequency_high_hz', 12.0)
So intead of 1 tuple I get 2! Of course I can overwrite the __iter__ method, but then I'm breaking dict(model) and it may become relevant somewhere. So instead I wrote a method as_tuple(), which is fine but defining __iter__ feels cleaner imho.
I too would favor @aeisenbarth solution of changing the default behavior from __iter__ to a separate specialized method.
Pydantic models are rather comparable to key-value stores (mappings) than sequences. (And I would not rely on a stable order, although Python 3 dicts and Pydantic fields actually have). Iterating a Python mapping yields the keys, not the values (and not key-value tuples, as Pydantic incorrectly does).
So what you pass to other apps must be more tuple-like.
- I think what you are looking for are maybe named tuples. If you define
class FrequencyRange(NamedTuple), Pydantic models can have it as field type and serialize/deserialize it, except that then FrequencyRange alone doesn't have all the Pydantic features. - If you want a Pydantic model, you could implement a custom base model class
SequenceLikeBaseModelwhere you override__iter__,__getitem__(int),__len__. - Or you pass
frequency.model_dump().values()to other apps.
Hi @aeisenbarth , thanks for the prompt response and your suggestions, I will keep those possibilities in mind going forward with Pydantic (still learning). For now my problem is solved simply having a way to return it as a tuple. I did indeed try Dataclasses and NamedTuples but for some specific functionality BaseModel was simply more convenient and not much of an issue to circumvent the unexpected __iter__ behavior.
I just thought, as I found this issue while googling about this, that I would present a case in which this issue could come up, and I liked your idea of freeing up the __iter__ method for Pydantic's BaseModel in general, not necessarily just for my use-case. I believe it's just better to have that method available if it can be done without breaking any feature, just makes Pydantic model's even more flexible imho.
Again, thanks a lot for taking the time to answer 😃