marshmallow icon indicating copy to clipboard operation
marshmallow copied to clipboard

Context is not thread-safe (at least) for nested scheme

Open hstadler opened this issue 4 years ago • 5 comments

I'm developing a web application with Flask, having many marhsmallow schema classes with nested schemes. I also make use of marshmallows context. Sometimes (e.g. first run of the application after a reboot) I noticed unexpected serialization results. It turned out that the context in one of the nested schemas differed from it's parent.

A simplified example, which shows the race condition for concurrent requests with a nested scheme:

from datetime import datetime
from flask import Flask
from marshmallow import Schema, fields, post_dump
import time
app = Flask(__name__)


class TestNestSchema(Schema):
    name = fields.String()

    @post_dump()
    def context_dump(self, data, **kwargs):
        data['before'] = self.context.get('test')
        # Do some work
        time.sleep(5)
        data['after'] = self.context.get('test')
        return data


class TestSchema(Schema):
    name = fields.String()
    nested = fields.Nested(TestNestSchema())


class TestNestedClass:
    name = 'TestNestedClass'


class TestClass:
    name = 'TestClass'
    nested = TestNestedClass()


@app.route('/')
def test_endpoint():
    test_schema = TestSchema(context={'test': str(datetime.now())})
    dump = test_schema.dump(TestClass())
    return {'data': dump}


if __name__ == "__main__":
    app.run()

When troubleshooting, I also took a brief look at the marhsmallow code and got the feeling that it wasn't just the context that might cause such problems. Is there a way to use marhsmallow thread-safe?

hstadler avatar Jun 19 '20 13:06 hstadler

@hstadler, I'm not able to reproduce the race-condition. Could you please provide a more complete example?

AndreiPashkin avatar Feb 18 '21 11:02 AndreiPashkin

Nevermind. @hstadler, race-condition in your case if easily fixable by changing nested = fields.Nested(TestNestSchema()) to nested = fields.Nested(TestNestSchema).

AndreiPashkin avatar Feb 18 '21 12:02 AndreiPashkin

Indeed, using a class rather than an instance should be safe. But it has the limitation that it doesn't allow the user to parametrize the nested schema with modifiers by instantiating it in the schema definition.

When using an instance, it seems the context is not replaced but updated.

https://github.com/marshmallow-code/marshmallow/blob/7f7e13162b074f7c939a5194cbf6083e6d2a0093/src/marshmallow/fields.py#L513-L515

    def test_context_thread_safe(self):
        class A(Schema):
            a = fields.Integer()

        class B(Schema):
            n = fields.Nested(A())  # <-- instance

        sb = B()
        sb.context = {"test_1": 1}
        sb.dump({"n": {"a": 1}})
        sb2 = B()
        sb2.context = {"test_2": 1}
        sb2.dump({"n": {"a": 1}})

In the latter case, the context is {"test_1": 1, "test_2": 1}.

I don't see the reason for the update, here. I'd change this to

            if isinstance(nested, SchemaABC):
                self._schema = copy.copy(nested)
                self._schema.context = context

I think this qualifies as a bug and could be fixed by the change above. IIUC, it only affects people using nested schema instances and contexts with different keys, such that the update differs from the assignment. (E.g. passing, {"user_id": xxx} should be safe because the next context overrides the old one.)

@sloria @deckar01 WDYT? Bug? CVE?

I'm proposing a change for marshmallow 4 that would solve this in a better way : #1826.

lafrech avatar Jun 14 '21 07:06 lafrech

@lafrech Thanks for investigating. I think your proposed fix is correct. Do you have time to run with the fix/release?

sloria avatar Jun 19 '21 15:06 sloria

I think the update in the original code is meant to allow a parent to update the nested schema context (as in #1791).

The fix I propose breaks this. I don't see any safe way to allow updating while being thread-safe but this might be a concern.

lafrech avatar Jun 22 '21 08:06 lafrech