mashumaro
Inconsistent checks for invalid value type for str field type
If a class based on DataClassDictMixin has a field of type str, it will construct instances from input where that field holds a value of another type, including numbers, lists, and dicts. However, fields of other types, e.g. int, reject incompatible values. I'm not sure whether this is intentional and I'm missing something, but it seems like unexpected and undesirable behaviour when you want the input data to be validated.
The following example only raises an error on the very last line:
from dataclasses import dataclass

from mashumaro import DataClassDictMixin

@dataclass
class StrType(DataClassDictMixin):
    a: str

StrType.from_dict({'a': 1})
StrType.from_dict({'a': [1, 2]})
StrType.from_dict({'a': {'b': 1}})

@dataclass
class IntType(DataClassDictMixin):
    a: int

IntType.from_dict({'a': 'blah'})
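To picture why the two field types might behave differently, here is a minimal plain-Python sketch. It assumes that an int field is loaded by calling int() on the value while a str field is passed through untouched; that is a guess about the internals, not something confirmed here. The helpers load_int_field, load_str_field and try_load are hypothetical names, not part of mashumaro.

```python
# Hypothetical stand-ins for per-field loaders; these names are
# illustrative only and do not exist in mashumaro.
def load_int_field(value):
    # Assumed behaviour: coerce with int(), which raises on bad input.
    return int(value)


def load_str_field(value):
    # Assumed behaviour: return the value unchanged, with no type check.
    return value


def try_load(loader, value):
    """Run a loader and report either the result or the exception name."""
    try:
        return ("ok", loader(value))
    except (TypeError, ValueError) as exc:
        return ("error", type(exc).__name__)


# The str loader accepts anything, so nothing is reported as an error.
print(try_load(load_str_field, 1))
print(try_load(load_str_field, [1, 2]))
print(try_load(load_str_field, {"b": 1}))

# The int loader fails only because int() itself refuses the input.
print(try_load(load_int_field, "blah"))
```

Under that assumption the int case is not really validated either: int() is a coercion, so a value such as "5" would still be accepted for an int field.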
There is no strict validation at the moment, for the sake of performance. It isn't needed in many cases, but I'm going to add optional validation. It will be enabled through the field or config options.
The current workaround is to use an explicit serialization strategy, either in the config or on the field:
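from dataclasses import dataclass, field

from mashumaro import DataClassDictMixin

def coerce_str(value):
    # Lenient option: convert whatever comes in to a string.
    return str(value)

def validate_str(value):
    # Strict option: reject anything that is not already a string.
    if not isinstance(value, str):
        raise ValueError(f"expected str, got {type(value).__name__}")
    return value

@dataclass
class StrType(DataClassDictMixin):
    a: str
    # Per-field alternative:
    # a: str = field(metadata={"deserialize": validate_str})

    class Config:
        # Applies to every str field of this class.
        serialization_strategy = {
            str: {
                "deserialize": validate_str,
                # "deserialize": coerce_str,
            },
        }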
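The two strategies handle the inputs from the original report differently. The sketch below calls the functions directly, outside mashumaro, so it only shows what each one does with a value and not how the library invokes them. The error message and the samples, coerced and rejected names are additions for illustration.

```python
def coerce_str(value):
    # Lenient: any input becomes its str() representation.
    return str(value)


def validate_str(value):
    # Strict: only real str instances are let through.
    if not isinstance(value, str):
        raise ValueError(f"expected str, got {type(value).__name__}")
    return value


samples = [1, [1, 2], {"b": 1}, "ok"]

# coerce_str never fails, but silently stringifies lists and dicts.
coerced = [coerce_str(v) for v in samples]

# validate_str rejects every non-str sample and keeps "ok".
rejected = []
for v in samples:
    try:
        validate_str(v)
    except ValueError:
        rejected.append(v)

print(coerced)
print(rejected)
```

Which one fits depends on the goal: coerce_str keeps loading permissive and yields a consistent type, while validate_str matches the "reject bad input" behaviour asked for in the issue.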
This library advertising itself as faster than cattrs piqued my interest, and I couldn't quite figure out how until I looked at the generated code and noticed it wasn't validating string dict keys. I'm all for filthy optimizations in the name of performance, but skipping validation by default isn't one of them ;)