marshmallow
Object to pass to Nested as missing
In deserialization, it seems that the missing value is returned as-is when there is no input value, rather than being passed to the nested Schema. Is there any way to have a "default" value for the nested schema?
e.g.:
class A(Schema):
    a = fields.String()

class B(Schema):
    b = fields.Nested(A, many=True, missing=[])

class C(Schema):
    c = fields.Nested(B, missing={})

C().load({})  # returns {'c': {}}
I would like it to return {'c': {'b': []}}, i.e. have it run the nested schema on an empty object and let the nested schema apply its own missing logic.
This could be a downside of https://github.com/marshmallow-code/marshmallow/issues/378 / https://github.com/marshmallow-code/marshmallow/pull/756.
I didn't try, but AFAIU, when the default value was passed in serialized form, it was deserialized, so in this case the Nested field would deserialize {} and apply its default values.
Since the change in #756, the deserialization does not happen.
I understand the need, but I don't see a simple way to get around this.
Any reason not to change missing such that it gets passed to the respective fields rather than being returned by them? Or add a new missing concept that does that?
Any reason not to change missing such that it gets passed to the respective fields rather than being returned by them?
AFAIU, "passed to the respective fields" means "passed as serialized data to deserialize", which is the old behaviour, before #756. Isn't it?
BTW, I suppose we had the symmetric issue with default before that change. It would be interesting to check.
Or add a new missing concept that does that?
A reason not to do it could be "complex API", unless we find an elegant way to do that without complicating things too much.
I'm really sad that this happened. I was using this a lot to initialize nested fields. Is there really no workaround to get this to work again?
edited to add: I do think this is a very big change in behaviour, and it would be good to explain this consequence of #756 in the changelog.
This is odd. I would imagine that setting defaults on nested fields is not an uncommon use case.
Here is a quick and dirty workaround:
class C(Schema):
    c = fields.Nested(B, missing=B().load({}))
If a callable value for missing had a way to access the nested schema instance, this would be a more robust option:
class C(Schema):
    c = fields.Nested(B, missing=lambda self: self.schema.load({}))
This solution works most of the time, but it doesn't propagate the context. Here's a solution @Kareeeeem and I came up with:
from marshmallow import Schema, fields, pre_load

class A(Schema):
    x = fields.String(missing='x')
    y = fields.String(missing='y')
    z = fields.String()

class B(Schema):
    a = fields.Nested(A, missing=dict)
    b = fields.Nested(A, missing=lambda: {'y': 'not y'})
    c = fields.Nested(A)

    @pre_load
    def load_missing_nested(self, data, **kwargs):
        # **kwargs absorbs the many/partial arguments that marshmallow 3 passes to hooks.
        # Fill in absent nested fields whose missing value is a callable.
        for fieldname, field in self.fields.items():
            if (fieldname not in data and isinstance(field, fields.Nested) and
                    callable(field.missing)):
                data[fieldname] = field.schema.load(field.missing())
        return data

B().load({})
{
    'a': {'x': 'x', 'y': 'y'},
    'b': {'x': 'x', 'y': 'not y'},
}
BTW, I suppose we had the symmetric issue with default before that change. It would be interesting to check.
Confirmed.
This code, before #756 (I actually tried on 2.x-line), prints {'c': {}}
from marshmallow import Schema, fields

class A(Schema):
    a = fields.String()

class B(Schema):
    b = fields.Nested(A, many=True, default=[])

class C(Schema):
    c = fields.Nested(B, default={})

print(C().dump({}))
So the issue is not new. It just impacts missing while it used to impact default. I guess default is used less often, but my point is that even reverting #756 wouldn't really be a satisfying answer.
This issue looks like it'll need deeper investigation, but I'd really hate to delay 3.0 any further. Since there are existing workarounds posted above, how do we feel about deferring this for post-3.0.0? @lafrech @deckar01
In the end we had to go with a different workaround than the one above, I forgot why exactly. The most practical thing was just to create our own Nested field and use that everywhere:
from marshmallow import fields, missing as missing_

class Nested(fields.Nested):
    """
    Field that will fill in nested before loading so nested missing fields will
    be initialized.
    """

    def deserialize(self, value, attr=None, data=None, **kwargs):
        self._validate_missing(value)
        if value is missing_:
            # Feed the field's missing value to the nested schema instead of returning it as-is
            _miss = self.missing
            value = _miss() if callable(_miss) else _miss
        return super().deserialize(value, attr, data, **kwargs)
Due to the former (MA2) behaviour, when people use {} as the nested_field.missing value, they may not mean "{}" but "this is an empty value, please give me the default load value of the schema, that is, the value the schema outputs when loading {}".
This is unfortunately not the new semantics for missing. The default value should be returned as is, therefore {} means {}.
OTOH, there seems to be demand for a feature that allows defaulting to the schema's default load value.
We could add another parameter to allow passing a missing value in serialized form. I'd rather avoid that double API exposure, but if it's just a shortcut for the workaround above, it could be nice to provide it. At least, it is worth investigating. That would be a non-breaking change.
There may be other ways to provide this in a non-breaking manner.
Note that default now acts like missing used to, so when passing {} as default, Nested dumps the default schema dump (including field defaults), not an empty schema. That's consistent with the fact that the value is expressed in object form, not serialized form. There is no way to specify an empty default if the schema has field defaults.
Overall, I don't object to postponing this to 3.x. But I agree this use case is legit and it would be a nice feature to have. And I understand it can be seen as a regression when coming from MA2.
Following @RosanneZe's approach seems to work in general, but initializing a DateTime field somehow results in weird behavior. I tried to create a small example here:
from marshmallow import Schema, fields, pre_load
import datetime

class NestedDateSchema(Schema):
    date_time = fields.DateTime(missing=lambda: datetime.datetime.now().isoformat())
    operator = fields.String(missing=">=")

class ParentSchema(Schema):
    name = fields.String(missing="TEST")
    time = fields.Nested(NestedDateSchema, missing=dict)

    @pre_load
    def load_missing_nested(self, data, **kwargs):
        for fieldname, field in self.fields.items():
            if (fieldname not in data and isinstance(field, fields.Nested) and callable(field.missing)):
                data[fieldname] = field.schema.load(field.missing())
        return data

# [...]
ParentSchema().load({'name': 'case1 - date_time="2022-03-01T11:05:52.142158"', 'time': {'operator': '<='}})
# date_time looks like a string
# vs
ParentSchema().load({'name': 'case2 - date_time=datetime.datetime(2022, 3, 1, 11, 28, 35, 80158)'})
# date_time is a datetime instance
Not using the lambda expression seems to prevent this weird behavior, but then the time is always the one from first import. Any ideas how this could be solved?
In my actual use case I get the following error: "'str' object has no attribute 'isoformat'" which may be related to the shown circumstances.