mashumaro

Inconsistent checks for invalid value type for str field type

Open OliverColeman opened this issue 3 years ago • 3 comments

If a class based on DataClassDictMixin has a field of type str, from_dict will construct instances from input whose value for that field has some other type, including numbers, lists, and dicts. However, fields of other types, e.g. int, do not accept incompatible values. Not sure if this is intentional and I'm missing something here, but it seems like unexpected and undesirable behaviour when you want the input data to be validated.

The following example only throws an error on the very last line:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin


@dataclass
class StrType(DataClassDictMixin):
    a: str

# None of these raise, although none of the values is a str:
StrType.from_dict({'a': 1})
StrType.from_dict({'a': [1, 2]})
StrType.from_dict({'a': {'b': 1}})


@dataclass
class IntType(DataClassDictMixin):
    a: int

# This is the only call that raises:
IntType.from_dict({'a': 'blah'})

OliverColeman, Feb 24 '21 22:02

There is no strict validation at the moment, for the sake of performance. It's not needed in many cases, but I'm going to add optional validation that can be turned on in the field or config options.

Fatal1ty, Feb 25 '21 07:02

The current workaround is to use an explicit serialization strategy, either in the config or on the field:

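from dataclasses import dataclass, field  # field is only needed for the per-field variant

from mashumaro import DataClassDictMixin


def coerce_str(value):
    # Lenient option: convert whatever comes in to a str.
    return str(value)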


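def validate_str(value):
    # Strict option: reject anything that is not already a str.
    if not isinstance(value, str):
        raise ValueError(f"expected str, got {type(value).__name__}")
    return value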


@dataclass
class StrType(DataClassDictMixin):
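    a: str
    # Per-field alternative:
    # a: str = field(metadata={"deserialize": validate_str})

    class Config:
        # Applies to every str field of this dataclass.
        serialization_strategy = {
            str: {
                "deserialize": validate_str,
                # Or, to coerce instead of validate:
                # "deserialize": coerce_str,
            },
        }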
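
If the same check is wanted for several types, the validator above could be generalised into a small factory. This is only a sketch: `strict` is a hypothetical helper name, not part of mashumaro, and rejecting `bool` for `int` is an assumption about what "strict" should mean (in Python `bool` is a subclass of `int`).

```python
from typing import Any, Callable, Type, TypeVar

T = TypeVar("T")


def strict(expected: Type[T]) -> Callable[[Any], T]:
    """Build a deserializer that only accepts instances of `expected`.

    Hypothetical helper for illustration; not a mashumaro API.
    """

    def deserialize(value: Any) -> T:
        # bool is a subclass of int, so isinstance(True, int) is True.
        # Assumption: a strict int field should not accept True/False.
        if expected is int and isinstance(value, bool):
            raise ValueError("expected int, got bool")
        if not isinstance(value, expected):
            raise ValueError(
                f"expected {expected.__name__}, got {type(value).__name__}"
            )
        return value

    return deserialize


# Equivalent to the hand-written validate_str in the workaround.
validate_str = strict(str)
validate_int = strict(int)
```

The resulting callables should fit the same slot as `validate_str` in the workaround, i.e. as the `"deserialize"` value for a type in `serialization_strategy` or in a field's metadata.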

Fatal1ty, Mar 26 '23 10:03

So this library advertising that it was faster than cattrs piqued my interest, and I couldn't quite figure out how until I looked at the generated code and noticed it wasn't validating string dict keys. I'm all for filthy optimizations in the name of performance, but skipping validation by default isn't one of them ;)

Tinche, Jul 25 '23 01:07