marshmallow icon indicating copy to clipboard operation
marshmallow copied to clipboard

unknown parameter passed to load method is not popagated to nested schema field.

Open lmignon opened this issue 6 years ago • 7 comments

Current behavior

When we deserialize a data structure containing a nested data structure by using the load method, the unknown parameter is without effect on the nested data structure and a error is raised if an unknown field is present into the nested data structure

Expected behavior:

unknown parameter into load method is propagated to the the load of all nested data structure.

Functional context:

when processing a REST call in a server, it is often useful to be able to ignore unknown fields in the json provided during the deserialization process. This does not mean that we want these fields to be ignored in all contexts. The use of the unknown parameter of the load method allows this use case. Unfortunately, it does not apply to the nested field.

lmignon avatar Oct 18 '19 06:10 lmignon

Discussed in #980.

We decided that Meta option should not propagate, but someone then argued that load argument could propagate.

lafrech avatar Oct 18 '19 07:10 lafrech

@lafrech Thank you for the link. In this case only the load argument is propagated. I fully agree that the Meta option should not be propagated since it 's possible to specify the unknown parameter at the NestedField declaration level. But IMO, since the unknown parameter of the load method can be used to force a specific behavior limited to the processing of the method, it seems logic that this parameter is propagated to the loading of nested data structure.

lmignon avatar Oct 18 '19 07:10 lmignon

I think this proposed change would make it too easy for developers to implicitly opt into undesired behavior. I don't think there is any precedent for method args behaving this much differently than their Meta counterparts, which would make it unexpected behavior for most developers.

If you want to change the default globally, you can with minimal code. See https://github.com/marshmallow-code/marshmallow/issues/1367. You could detect the context and set the global default before the schemas are declared.

If you need to apply this transformation on certain schemas, I would recommend solving this use case with a utility method in your project that recursively walks through the schema fields and sets the unknown attribute on Nested schemas.

def set_unknown_all(schema, unknown):
    schema.unknown = unknown
    for field in schema.fields.values():
        if isinstance(field, Nested):
            set_unknown_all(field.schema, unknown)
    return schema

deckar01 avatar Nov 25 '19 19:11 deckar01

@deckar01 When you call the load method you already have the possibility to provide your specific 'unknown' paramater that takes precedence on the one defined into your Meta counterpart. That give you the ability to control this behavior at load time without having to modify your schema. This behavior is useful when you want to validate/parse json provided by external systems without having to define the full schema. At the same time it's not because I want to ignore these unknown fields when loaded from external systems that I want to ignore these when loaded from within my code. The proposed change implemented in #1429 is to propagate the 'unknown parameter' that you can already provide to the load method to the loading of nested schemas.

Your proposed utility method will alter the schema definition; that's exactly what I want to avoid.

lmignon avatar Nov 25 '19 22:11 lmignon

If you set the Meta unknown option or unknown schema constructor argument, it doesn't recursively propagate this property. It would not be consistent or intuitive behavior for it to start propagating recursively when it is a load argument. This feature would be a foot-gun for unwitting developers. This feature has security implications which need to be considered.

Overriding unknown in bulk is not a use case we necessarily want to dedicate API surface area making easier. Two other discussions on the topic came to this conclusion. See #1367 and #980.

Another consideration is that this proposal would be a breaking change. It could not be released until the next major version. It should be possible to satisfy your use case with a small amount of utility code now.

Your proposed utility method will alter the schema definition; that's exactly what I want to avoid.

If you pass it a schema instance it should be fine.

schema = Schema()
permissive_schema = set_unknown_all(Schema(), INCLUDE)

deckar01 avatar Nov 25 '19 23:11 deckar01

Two other discussions on the topic came to this conclusion. See #1367 and #980.

I don't see any conclusion on this topic in the context of the load method. The discussion is still open with pros and cons and it seems that I'm not the only one thinking that it should be possible to propagate this parameter when loading...

This feature would be a foot-gun for unwitting developers.

Unwitting developers are foot-guns. These developers don't use Marshmallow at all... :smirk:

lmignon avatar Nov 26 '19 07:11 lmignon

If you pass it a schema instance it should be fine.

It's not fine since the value of the 'unknown' parameter specified at class level definition is lost in the instance. (I can transform this method to become a decorator ...)

lmignon avatar Nov 26 '19 07:11 lmignon