marshmallow Dot in key is not parsed correctly on loads()

Hey, devs! First, thanks for the great project! It's really useful and helpful. I found some weird behavior while experimenting with dictionary keys that contain dots ("."). Here is an example:

>>> from marshmallow import Schema, fields
>>> TestSchema = Schema.from_dict({"something": fields.Str(), "some.thing": fields.Str()})
>>> TestSchema().loads('{"something": "data"}')
{'something': 'data'}
>>> TestSchema().loads('{"some.thing": "data"}')
{'some': {'thing': 'data'}}

Is this the intended behavior? How can I work around it?

Oct 05 '23 19:10 dchirikov

I don't see an explicit test for this. @sloria was this intended?

Oct 05 '23 20:10 lafrech

from_dict lets you use field names that would otherwise not be valid class attributes. The observed behavior is consistent with using dotted keys for data_key and attribute.

from marshmallow import Schema, fields


Test = Schema.from_dict({'foo.bar': fields.Str()})
schema = Test()
obj = schema.load({'foo.bar': 'baz'})
print('from_dict', 'load', obj)
data = schema.dump(obj)
print('from_dict', 'dump', data)


class Test(Schema):
    foo_bar = fields.Str(data_key='foo.bar', attribute='foo.bar')


schema = Test()
obj = schema.load({'foo.bar': 'baz'})
print('Schema', 'load', obj)
data = schema.dump(obj)
print('Schema', 'dump', data)

from_dict  load  {'foo': {'bar': 'baz'}}
from_dict  dump  {'foo.bar': 'baz'}

Schema     load  {'foo': {'bar': 'baz'}}
Schema     dump  {'foo.bar': 'baz'}

This behavior is not documented for attribute, though. It appears to be a side effect of performing dotted name resolution in get_value at load time.
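
To illustrate how resolving a dotted name can turn a flat key into nested output, here is a minimal pure-Python sketch. It is not marshmallow's actual code; the helper name set_dotted is hypothetical and only mimics the nesting seen in the output above.

```python
def set_dotted(target, key, value):
    """Store value in target, treating each dot in key as a nesting level.

    Hypothetical helper for illustration only; not part of marshmallow.
    """
    head, sep, rest = key.partition(".")
    if not sep:
        # No dot left: assign directly at this level.
        target[head] = value
        return
    # Create (or reuse) the intermediate dict, then recurse into it.
    child = target.setdefault(head, {})
    set_dotted(child, rest, value)


loaded = {}
set_dotted(loaded, "something", "data")   # plain key stays flat
set_dotted(loaded, "some.thing", "data")  # dotted key becomes nested
print(loaded)
```

A key without a dot is stored as-is, while "some.thing" ends up as a "thing" entry inside a "some" dict, which is the same shape reported in the original question.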

A workaround would be to explicitly define an attribute for the field without dots so that no nesting occurs.

from marshmallow import Schema, fields


Test = Schema.from_dict({'foo.bar': fields.Str(attribute='foo_bar')})
schema = Test()
obj = schema.load({'foo.bar': 'baz'})
print('from_dict', 'load', obj)
data = schema.dump(obj)
print('from_dict', 'dump', data)


class Test(Schema):
    foo_bar = fields.Str(data_key='foo.bar', attribute='foo_bar')


schema = Test()
obj = schema.load({'foo.bar': 'baz'})
print('Schema', 'load', obj)
data = schema.dump(obj)
print('Schema', 'dump', data)

from_dict  load  {'foo_bar': 'baz'}
from_dict  dump  {'foo.bar': 'baz'}

Schema     load  {'foo_bar': 'baz'}
Schema     dump  {'foo.bar': 'baz'}

Oct 05 '23 21:10 deckar01

This behavior is intentional, and it is covered by tests for attribute. See #450.

It is tempting to start adding ways to opt out of this behavior, but I don't think it is actually necessary to support arbitrary key structures in deserialized objects. If something is consuming the data and dictating the key structure, it should be consuming dumped data. Otherwise the code can conform to the default output or customize it with enveloping.

We should update the docs for attribute and from_dict to advertise this behavior.

Oct 05 '23 21:10 deckar01
