marshmallow
marshmallow copied to clipboard
Use OrderedSet as default set_class
Use OrderedSet
as default set_class
.
Closes #1744.
~We could do that already but I'd rather wait until we drop Python 3.6 so that the change is easier to document: all users will be affected.~
No need to change ordered
at this stage. In marshmallow 4 we can do the same with dict_class and let users override if they really want an OrderedDict
.
This change generates mypy issues. I'd be happy to get help on this.
I can't seem to find a way to silence mypy, here. Help welcome.
Mypy error for those curious:
src/marshmallow/schema.py:919: error: Incompatible types in assignment (expression has type "OrderedSet", variable has type "Optional[Union[Sequence[str], Set[str]]]")
src/marshmallow/schema.py:924: error: Incompatible types in assignment (expression has type "OrderedSet", variable has type "Set[Union[Any, str]]")
src/marshmallow/schema.py:972: error: Incompatible types in assignment (expression has type "AbstractSet[Any]", variable has type "OrderedSet")
The first error is because self.only
is configured to be StrSequenceOrSet | set_class | None
so when you assign an instance variable to it, it's no longer valid as that type, as StrSequenceOrSet
is defined as StrSequenceOrSet = typing.Union[typing.Sequence[str], typing.Set[str]]
which doesn't match the MutableSet being subclassed (set vs. mutable set being the key error here).
Updating StrSequenceOrSet
to the following fixes the first issue:
StrSequenceOrSet = typing.Union[
typing.Sequence[str], typing.Set[str], typing.MutableSet[str]
]
The second one is similar in nature, self.exclude
is inferred as Set[Any]
because of:
self.exclude = set(self.opts.exclude) | set(exclude)
To address this, you need to allow mutable sets because that's what the OrderedSet is subclassing.
self.exclude: typing.Set[typing.Any] | typing.MutableSet[typing.Any] = set(
self.opts.exclude
) | set(exclude)
Finally, to address the last issue you need to tell Python that the field_names
is ok being less specific (an AbstractSet rather than either a Mutable or immutable set:
field_names: typing.AbstractSet[typing.Any] = self.set_class(self.only)
With that, you'll pass all tests. Though, the OrderedSet is still only partially defined as it's missing the generic parameters and whatnot
I should note, per https://docs.python.org/3/library/typing.html typing.AbstractSet
is preferred over typing.Set
when possible:
To annotate arguments it is preferred to use an abstract collection type such as AbstractSet.
This is because AbstractSet is the base collection object and doesn't limit you to a specific type of set. If you need more specific behaviors though, that's when reaching for MutableSet or Set can be helpful. This aligns with how it's a best practice not to accept things like List[str]
or Dict[str, str]
as parameters, instead preferring the protocols like Collection[str]
and Mapping[str, str]
which allow for various types to satisfy their constraints
https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes can help understand the hierarchy of base collection types, https://docs.python.org/3/library/typing.html#typing.AbstractSet recommends collections.abc.Set
as of 3.9, to connect that back to it
EDIT: connected how AbstractSet relates here
This is a rough re-write of OrderedSet with type information. I didn't make a PR as I'm not confident enough in how map and end are being used to ensure that the types are correct. The use of lists instead of tuples makes typing a little more difficult as well, because we can't type locations in a list while we can with a tuple.
# OrderedSet
# Copyright (c) 2009 Raymond Hettinger
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
from collections.abc import MutableSet
from typing import Optional, Generator, Iterable, TypeVar
_Key = TypeVar("_Key")
class OrderedSet(MutableSet[_Key]):
def __init__(self, iterable: Optional[Iterable[_Key]] = None) -> None:
self.end = end = [] # type: ignore
end += [None, end, end] # sentinel node for doubly linked list
self.map = {} # key --> [key, prev, next] # type: ignore
if iterable is not None:
self |= set(iterable)
def __len__(self) -> int:
return len(self.map)
def __contains__(self, key: object) -> bool:
return key in self.map
def add(self, key: _Key) -> None:
if key not in self.map:
end = self.end
curr = end[1]
curr[2] = end[1] = self.map[key] = [key, curr, end]
def discard(self, key: _Key) -> None:
if key in self.map:
key, prev, next = self.map.pop(key)
prev[2] = next
next[1] = prev
def __iter__(self) -> Generator[_Key, None, None]:
end = self.end
curr = end[2]
while curr is not end:
yield curr[0]
curr = curr[2]
def __reversed__(self) -> Generator[_Key, None, None]:
end = self.end
curr = end[1]
while curr is not end:
yield curr[0]
curr = curr[1]
def pop(self, last: bool = True) -> _Key:
if not self:
raise KeyError("set is empty")
key: _Key = self.end[1][0] if last else self.end[2][0]
self.discard(key)
return key
def __repr__(self) -> str:
if not self:
return f"{self.__class__.__name__}()"
return f"{self.__class__.__name__}({list(self)!r})"
def __eq__(self, other: object) -> bool:
if isinstance(other, OrderedSet):
return len(self) == len(other) and list(self) == list(other)
if isinstance(other, Iterable):
return set(self) == set(other)
return False
if __name__ == "__main__":
s = OrderedSet("abracadaba")
t = OrderedSet("simsalabim")
print(s | t)
print(s & t)
print(s - t)
Hope this helps :)
Thank you so much for looking into this!
Regarding the OrderedSet type hints, feel free to send a PR even as draft, if only for history. Also, you might want to have a look at ordered-set lib, see how they did id: https://github.com/rspeer/ordered-set/blob/master/ordered_set/init.py (and perhaps this SO question: https://stackoverflow.com/questions/52569356/provide-orderedsetint-like-types-using-stub-file-without-modifying-ordered-set).
Thanks, submitted a few PR's to her repo to address some incorrect type usage (specifically, incorrect usage of Any
). If you're curious, this comment explains the difference between using Any
and object
as a parameter:
https://github.com/python/typeshed/pull/8469#issuecomment-1203116838
I think this is ready to go.
I removed the section about ordered
from the quickstart because it is outdated and the option is now only required for a very specific purpose: using an OrderedDict
for output, which doesn't need to be documented in a quickstart.
Not sure how and where to document ordered
, now.
Does this change warrant an "upgrading to ..." section?
TODO:
- [x] Add test showing how to specify
OderedDict
as a dict class as class attribute. - [ ] Deprecate
ordered
- [ ] Deprecate
dict_class
argument ofget_declared_fields
test_options.py
is quite focused on the ordered
option and deprecating it would require changes there. Besides, those tests would need to be kept to ensure they don't break until 4.0, and I'm not sure how to silent the deprecation warnings that would be raised.
I don't want to engage in this right now. We can deal with this later. In fact, we can even skip the deprecation phase and remove the parameter and those tests in 4.0.
@sloria I shall merge/release this soonish. Ping me if you want time to review.