Add fields.Iterable as a base class for collection field types
I'd like to deserialize into some types that aren't json serializable, for example a frozenset of integers, frozenset({1,2,3}).
I can think of a few options in the current API:
- define a custom
FrozensetFieldtype and define its_serializeand_deserializemethods, - use
fields.Function(deserialize=frozenset, serialize=list), - define a custom replacement for the json module.
Is fields.Function the recommended strategy here? I'm unsure because
- it doesn't take a
cls_or_instanceparameter likefields.Listdoes, so it's not obvious how to make asetof some other objects. For example, how do I define a frozenset offields.Nested(OtherSchema)? - it makes
Functionthe main focus of attention, instead of the data.
Alternatively, an API that comes to mind is something like
class SomeSchema(marshmallow.Schema):
x = fields.Integer()
y = fields.String()
class MoreSchema(marshmallow.Schema):
values = fields.List(fields.Nested(SomeSchema), deserialize=frozenset)
data = fields.Mapping(
keys=fields.String(),
values=fields.Integer(),
deserialize=collections.OrderedDict
)
Like fields.Function, these might also take a serialize argument, defaulting to list and dict, respectively.
This would generalize the existing idea of Mapping.mapping_type , and would make it an instance attribute rather than a class attribute.
What is the recommended approach for deserializing to Frozenset[int] or Frozenset[MyObject]? Do you think an API like the idea above would be useful?
I like the idea of a generic iterable patterned after Mapping that lets you specify your own container. I would probably keep the same class attribute API though. Defining containers on fields ad hoc is a little messy. Declaring custom fields maps better to what is happening under the hood and avoids code that has to run even if the feature isn't used.
I'm optimistic that parametrized iterable types can be pretty simple and efficient. An implementation copies nearly all of the fields.List code. The main differences are quite minor:
- for serialization, just replace the list comprehension with a generator expression wrapped in a constructor
- for deserialization, build the list as normal, and if the desired type is not
list, construct it from the list.
import marshmallow
from marshmallow import fields
from marshmallow.fields import Field
class Iterable(Field):
"""An iterable field of specified type, composed with another `Field` class or
instance.
Example: ::
numbers = fields.Iterable(fields.Float(), deserialize_to_type=frozenset)
:param Field cls_or_instance: A field class or instance.
:param: deserialize_to_type: The type to instantiate when loading. Defaults to ``list``.
:param: serialize_to_type: The type to instantiate when dumping. Defaults to ``list``.
:param bool default: Default value for serialization.
:param kwargs: The same keyword arguments that :class:`Field` receives.
.. versionchanged:: 2.0.0
The ``allow_none`` parameter now applies to deserialization and
has the same semantics as the other fields.
"""
default_error_messages = {'invalid': 'Not a valid list.'}
def __init__(self, cls_or_instance, deserialize_to_type=list, serialize_to_type=list, **kwargs):
super().__init__(**kwargs)
self.deserialize_to_type = deserialize_to_type
self.serialize_to_type = serialize_to_type
try:
self.container = resolve_field_instance(cls_or_instance)
except FieldInstanceResolutionError:
raise ValueError(
'The list elements must be a subclass or instance of '
'marshmallow.base.FieldABC.',
)
def _bind_to_schema(self, field_name, schema):
super()._bind_to_schema(field_name, schema)
self.container = copy.deepcopy(self.container)
self.container.parent = self
self.container.name = field_name
def _serialize(self, value, attr, obj, **kwargs):
if value is None:
return None
if utils.is_collection(value):
return self.serialize_to_type(
self.container._serialize(each, attr, obj, **kwargs) for each in value
)
return self.serialize_to_type([self.container._serialize(value, attr, obj, **kwargs)])
def _deserialize(self, value, attr, data, **kwargs):
if not utils.is_collection(value):
self.fail('invalid')
result = []
errors = {}
for idx, each in enumerate(value):
try:
result.append(self.container.deserialize(each))
except ValidationError as error:
if error.valid_data is not None:
result.append(error.valid_data)
errors.update({idx: error.messages})
if errors:
raise ValidationError(errors, valid_data=result)
if self.deserialize_to_type == list:
return result
return self.deserialize_to_type(result)
This functionality can be achieved with a custom field without changing the core with minimal code:
from marshmallow import fields, Schema
class FrozenSet(fields.List):
def _deserialize(self, *args, **kwargs):
return frozenset(super()._deserialize(*args, **kwargs))
class Test(Schema):
foo = FrozenSet(fields.Str)
schema = Test()
data = schema.load({'foo': ['a', 'b', 'c']}).data
# {'foo': frozenset({'c', 'b', 'a'})}
schema.load(data)
# {'foo': ['c', 'b', 'a']}
If it needed to have convenience API I would recommend keeping the Iterable interface consistent with Mapping. This pattern encourages creating custom field classes for new types and minimizes the surface area of the constructor.
class FrozenSet(fields.Iterable):
iterable_type = frozenset
Ok, sounds good :-)