marshmallow icon indicating copy to clipboard operation
marshmallow copied to clipboard

Add fields.Iterable as a base class for collection field types

Open jtrakk opened this issue 6 years ago • 4 comments

I'd like to deserialize into some types that aren't json serializable, for example a frozenset of integers, frozenset({1,2,3}).

I can think of a few options in the current API:

  • define a custom FrozensetField type and define its _serialize and _deserialize methods,
  • use fields.Function(deserialize=frozenset, serialize=list),
  • define a custom replacement for the json module.

Is fields.Function the recommended strategy here? I'm unsure because

  • it doesn't take a cls_or_instance parameter like fields.List does, so it's not obvious how to make a set of some other objects. For example, how do I define a frozenset of fields.Nested(OtherSchema)?
  • it makes Function the main focus of attention, instead of the data.

Alternatively, an API that comes to mind is something like

class SomeSchema(marshmallow.Schema):
    x = fields.Integer()
    y = fields.String()

class MoreSchema(marshmallow.Schema):
    values = fields.List(fields.Nested(SomeSchema), deserialize=frozenset)
    data = fields.Mapping(
        keys=fields.String(), 
        values=fields.Integer(), 
        deserialize=collections.OrderedDict
    )

Like fields.Function, these might also take a serialize argument, defaulting to list and dict, respectively.

This would generalize the existing idea of Mapping.mapping_type , and would make it an instance attribute rather than a class attribute.

What is the recommended approach for deserializing to Frozenset[int] or Frozenset[MyObject]? Do you think an API like the idea above would be useful?

jtrakk avatar Jun 19 '19 09:06 jtrakk

I like the idea of a generic iterable patterned after Mapping that lets you specify your own container. I would probably keep the same class attribute API though. Defining containers on fields ad hoc is a little messy. Declaring custom fields maps better to what is happening under the hood and avoids code that has to run even if the feature isn't used.

deckar01 avatar Jun 19 '19 15:06 deckar01

I'm optimistic that parametrized iterable types can be pretty simple and efficient. An implementation copies nearly all of the fields.List code. The main differences are quite minor:

  • for serialization, just replace the list comprehension with a generator expression wrapped in a constructor
  • for deserialization, build the list as normal, and if the desired type is not list, construct it from the list.
import marshmallow
from marshmallow import fields
from marshmallow.fields import Field



class Iterable(Field):
    """An iterable field of specified type, composed with another `Field` class or
    instance.

    Example: ::

        numbers = fields.Iterable(fields.Float(), deserialize_to_type=frozenset)

    :param Field cls_or_instance: A field class or instance.
    :param: deserialize_to_type: The type to instantiate when loading. Defaults to ``list``.
    :param: serialize_to_type: The type to instantiate when dumping. Defaults to ``list``.
    :param bool default: Default value for serialization.
    :param kwargs: The same keyword arguments that :class:`Field` receives.

    .. versionchanged:: 2.0.0
        The ``allow_none`` parameter now applies to deserialization and
        has the same semantics as the other fields.
    """

    default_error_messages = {'invalid': 'Not a valid list.'}

    def __init__(self, cls_or_instance, deserialize_to_type=list, serialize_to_type=list, **kwargs):
        super().__init__(**kwargs)
        self.deserialize_to_type = deserialize_to_type
        self.serialize_to_type = serialize_to_type
        try:
            self.container = resolve_field_instance(cls_or_instance)
        except FieldInstanceResolutionError:
            raise ValueError(
                'The list elements must be a subclass or instance of '
                'marshmallow.base.FieldABC.',
            )

    def _bind_to_schema(self, field_name, schema):
        super()._bind_to_schema(field_name, schema)
        self.container = copy.deepcopy(self.container)
        self.container.parent = self
        self.container.name = field_name

    def _serialize(self, value, attr, obj, **kwargs):
        if value is None:
            return None
        if utils.is_collection(value):
            return self.serialize_to_type(
                self.container._serialize(each, attr, obj, **kwargs) for each in value
            )
        return self.serialize_to_type([self.container._serialize(value, attr, obj, **kwargs)])

    def _deserialize(self, value, attr, data, **kwargs):
        if not utils.is_collection(value):
            self.fail('invalid')

        result = []
        errors = {}
        for idx, each in enumerate(value):
            try:
                result.append(self.container.deserialize(each))
            except ValidationError as error:
                if error.valid_data is not None:
                    result.append(error.valid_data)
                errors.update({idx: error.messages})
        if errors:
            raise ValidationError(errors, valid_data=result)

        if self.deserialize_to_type == list:
            return result
        return self.deserialize_to_type(result)

jtrakk avatar Jun 19 '19 21:06 jtrakk

This functionality can be achieved with a custom field without changing the core with minimal code:

from marshmallow import fields, Schema

class FrozenSet(fields.List):
    def _deserialize(self, *args, **kwargs):
        return frozenset(super()._deserialize(*args, **kwargs))

class Test(Schema):
    foo = FrozenSet(fields.Str)

schema = Test()
data = schema.load({'foo': ['a', 'b', 'c']}).data
# {'foo': frozenset({'c', 'b', 'a'})}
schema.load(data)
# {'foo': ['c', 'b', 'a']}

If it needed to have convenience API I would recommend keeping the Iterable interface consistent with Mapping. This pattern encourages creating custom field classes for new types and minimizes the surface area of the constructor.

class FrozenSet(fields.Iterable):
    iterable_type = frozenset

deckar01 avatar Jun 19 '19 21:06 deckar01

Ok, sounds good :-)

jtrakk avatar Jun 20 '19 00:06 jtrakk