Add a way to trim whitespace characters

Open 245967906 opened this issue 6 years ago • 17 comments

Something like the trim_whitespace parameter in Django REST framework.

245967906 avatar Sep 11 '19 11:09 245967906

You could use a custom field.

from marshmallow import fields, Schema


class TrimmedString(fields.String):
    def _deserialize(self, value, *args, **kwargs):
        if hasattr(value, 'strip'):
            value = value.strip()
        return super()._deserialize(value, *args, **kwargs)


class ArtistSchema(Schema):
    name = TrimmedString(required=True)


print(ArtistSchema().load({"name": " David "}))
# {'name': 'David'}

You could also use field composition if you want to trim other field types.

from marshmallow import fields, Schema


class Trim(fields.Field):
    # Wraps another field and strips string input before delegating to it.
    def __init__(self, inner, *args, **kwargs):
        self.inner = inner
        super().__init__(*args, **kwargs)

    def _bind_to_schema(self, field_name, parent):
        # Bind the wrapped field as well so it knows its name and schema.
        super()._bind_to_schema(field_name, parent)
        self.inner._bind_to_schema(field_name, parent)

    def _deserialize(self, value, *args, **kwargs):
        # Only strip values that support it; everything else passes through.
        if hasattr(value, 'strip'):
            value = value.strip()
        return self.inner._deserialize(value, *args, **kwargs)

    def _serialize(self, *args, **kwargs):
        return self.inner._serialize(*args, **kwargs)


class ArtistSchema(Schema):
    name = Trim(fields.String(), required=True)
    email = Trim(fields.Email())


print(ArtistSchema().load({"name": " David ", "email": " [email protected] "}))
# {'name': 'David', 'email': '[email protected]'}

That said, I'm going to keep this issue open because this may be a common enough use case to justify adding to marshmallow core. Feedback welcome.

sloria avatar Sep 11 '19 12:09 sloria

Should we add a strip_whitespace parameter to Field? @lafrech @deckar01

sloria avatar Sep 11 '19 12:09 sloria

@sloria Thanks for your quick reply. The example you gave above solves my problem, which is great. But as you said, adding a strip_whitespace parameter might be a better way.

245967906 avatar Sep 11 '19 13:09 245967906

Copying https://github.com/marshmallow-code/marshmallow/pull/1397#issuecomment-531323116 here for discussion:

@sloria Hi, I looked at the implementations in DRF and typesystem. To summarize:

* DRF's CharField and DateTimeField inherit from the same base class, Field, and the trim_whitespace option only applies to CharField.

* In typesystem, DateTime inherits from String, so both are affected by trim_whitespace, which is defined on String.

But in marshmallow, things are different from the two implementations above.

If strip_whitespace is implemented on Field as you suggest, then for a field like Number such a parameter has no meaningful semantics.

So should we follow DRF and leave DateTime without the option, or make DateTime inherit from String as typesystem does? I can't choose between them, and I think this needs a decision from you.

This is a thorny issue and may take a lot of your time. In any case, if you have a decision or any ideas, please let me know. I would like to help get this feature finished as soon as possible.

Thank you!

sloria avatar Sep 14 '19 13:09 sloria

strip_whitespace could be applied to any field that can deserialize strings, including Number (" 1 " -> 1). So I still think adding it to Field rather than String makes sense.

sloria avatar Sep 14 '19 17:09 sloria

I don't think this is a convincing reason, because for number types, providing the strip_whitespace parameter makes no difference to the result.

>>> float(" 123 ")
123.0
>>>
>>> int(" 123 ")
123

245967906 avatar Sep 15 '19 02:09 245967906

Sure, Python's float and int functions implicitly strip whitespace, so perhaps Number is a bad example. But it is still relevant for any field that deserializes strings, e.g. Boolean.

sloria avatar Sep 15 '19 02:09 sloria

Boolean fields do have this problem. So is the preferred solution now to handle this in Field, rather than changing the base class of some fields to String?

245967906 avatar Sep 15 '19 02:09 245967906

I'm not sure. Perhaps we could have a TrimmableMixin for fields that deserialize strings. Or we could use the composable Trim implementation I posted before.

sloria avatar Sep 15 '19 14:09 sloria

Decorator version:

def trim_before(wrapped):
    def wrapper(self, value, *args, **kwargs):
        if hasattr(value, 'strip'):
            value = value.strip()
        return wrapped(self, value, *args, **kwargs)

    return wrapper


from marshmallow import fields

fields.String._deserialize = trim_before(fields.String._deserialize)

ddcatgg avatar Nov 23 '21 08:11 ddcatgg

Thanks for the example. Is there any reason to add the condition if hasattr(value, 'strip')? Any string will have the strip attribute.

jonbesga avatar Mar 23 '22 08:03 jonbesga

Has there been any decision yet on implementing this in the library?

JFDValente avatar Feb 08 '23 13:02 JFDValente