marshmallow-sqlalchemy icon indicating copy to clipboard operation
marshmallow-sqlalchemy copied to clipboard

Memory usage creep in >=1.4.0

Open mistercrunch opened this issue 8 months ago • 12 comments

After a fair amount of bissecting to pinpoint this issue, I wanted to surface the fact that our CI unit test suite fails consistently when using >=1.4.0 (in Apache Superset). I'm guessing it's related to the changes around PickleType (?).

Wondering if there's memory leakage around the change, or if it's expected to use more memory, pushing us over the edge of the memory we have in CI.

Thought I'd surface it here as I'm pinning the dependency to marshmallow-sqlalchemy>=1.3.0,<1.4.0 in our repo, and will reference this issue as a comment above our pin rule.

Note that the issue could be from 1.4.1 too as I didn't bissect all the way, but given my experience with pickles I thought the issue is likely related to that.

Image

mistercrunch avatar Mar 19 '25 17:03 mistercrunch

thank you for reporting! will investigate this soon.

the PickleType fix is an unlikely culprit, since it just ensures that models with a PickleType column properly get a marshmallow fields.Raw field in the corresponding Schema (iirc previously an error was raised).

https://github.com/marshmallow-code/marshmallow-sqlalchemy/pull/657 is something we should double-check, since it adds code that runs on schema class declaration (i.e. in the metaclass).

sloria avatar Mar 19 '25 19:03 sloria

given that the suspected PR is in 1.4.1, @mistercrunch would you mind trying marshmallow-sqlalchemy==1.4.0 to see if it's affected?

sloria avatar Mar 19 '25 19:03 sloria

Oh wait, closed it by mistake, please reopen, turns out I can recreate consistently on 1.4.1. See it on this PR https://github.com/apache/superset/pull/32762, here's the CI job: https://github.com/apache/superset/actions/runs/13958571093/job/39075360691?pr=32762

Issue/symptom manifests as a simple Error: The operation was canceled., which from experience is what happens when GHA runs out of memory.

Image

mistercrunch avatar Mar 19 '25 23:03 mistercrunch

Here's the output of git diff 1.4.1 1.4.0 on your repo, couldn't find anything obvious in the diff...

$ git diff 1.4.1 1.4.0
diff --git a/src/marshmallow_sqlalchemy/fields.py b/src/marshmallow_sqlalchemy/fields.py
index c016eb0..2db28ca 100644
--- a/src/marshmallow_sqlalchemy/fields.py
+++ b/src/marshmallow_sqlalchemy/fields.py
@@ -160,4 +160,4 @@ def get_primary_keys(model: type[DeclarativeMeta]) -> list[MapperProperty]:
 
 
 def ensure_list(value: Any) -> list:
-    return list(value) if is_iterable_but_not_string(value) else [value]
+    return value if is_iterable_but_not_string(value) else [value]
diff --git a/src/marshmallow_sqlalchemy/load_instance_mixin.py b/src/marshmallow_sqlalchemy/load_instance_mixin.py
index 4d5c9ba..0353d62 100644
--- a/src/marshmallow_sqlalchemy/load_instance_mixin.py
+++ b/src/marshmallow_sqlalchemy/load_instance_mixin.py
@@ -8,9 +8,7 @@
 
 from __future__ import annotations
 
-import importlib.metadata
-from collections.abc import Iterable, Mapping, Sequence
-from typing import Any, Generic, TypeVar, Union, cast
+from typing import TYPE_CHECKING, Any, Generic, TypeVar, cast
 
 import marshmallow as ma
 from sqlalchemy.ext.declarative import DeclarativeMeta
@@ -19,16 +17,10 @@ from sqlalchemy.orm.exc import ObjectDeletedError
 
 from .fields import get_primary_keys
 
-_LoadDataV3 = Union[Mapping[str, Any], Iterable[Mapping[str, Any]]]
-_LoadDataV4 = Union[Mapping[str, Any], Sequence[Mapping[str, Any]]]
-_LoadDataT = TypeVar("_LoadDataT", _LoadDataV3, _LoadDataV4)
-_ModelType = TypeVar("_ModelType", bound=DeclarativeMeta)
-
+if TYPE_CHECKING:
+    from collections.abc import Iterable, Mapping
 
-def _cast_data(data):
-    if int(importlib.metadata.version("marshmallow")[0]) >= 4:
-        return cast(_LoadDataV4, data)
-    return cast(_LoadDataV3, data)
+_ModelType = TypeVar("_ModelType", bound=DeclarativeMeta)
 
 
 class LoadInstanceMixin:
@@ -122,7 +114,7 @@ class LoadInstanceMixin:
 
         def load(
             self,
-            data: _LoadDataT,
+            data: Mapping[str, Any] | Iterable[Mapping[str, Any]],
             *,
             session: Session | None = None,
             instance: _ModelType | None = None,
@@ -144,13 +136,13 @@ class LoadInstanceMixin:
                 raise ValueError("Deserialization requires a session")
             self.instance = instance or self.instance
             try:
-                return cast(ma.Schema, super()).load(_cast_data(data), **kwargs)
+                return cast(ma.Schema, super()).load(data, **kwargs)
             finally:
                 self.instance = None
 
         def validate(
             self,
-            data: _LoadDataT,
+            data: Mapping[str, Any] | Iterable[Mapping[str, Any]],
             *,
             session: Session | None = None,
             **kwargs,
@@ -159,7 +151,7 @@ class LoadInstanceMixin:
             self._session = session or self._session
             if not (self.transient or self.session):
                 raise ValueError("Validation requires a session")
-            return cast(ma.Schema, super()).validate(_cast_data(data), **kwargs)
+            return cast(ma.Schema, super()).validate(data, **kwargs)
 
         def _split_model_kwargs_association(self, data):
             """Split serialized attrs to ensure association proxies are passed separately.
diff --git a/src/marshmallow_sqlalchemy/schema.py b/src/marshmallow_sqlalchemy/schema.py
index 30452c0..ba7b934 100644
--- a/src/marshmallow_sqlalchemy/schema.py
+++ b/src/marshmallow_sqlalchemy/schema.py
@@ -1,11 +1,10 @@
 from __future__ import annotations
 
-import inspect
 from typing import TYPE_CHECKING, Any, cast
 
 import sqlalchemy as sa
 from marshmallow.fields import Field
-from marshmallow.schema import Schema, SchemaMeta, SchemaOpts, _get_fields
+from marshmallow.schema import Schema, SchemaMeta, SchemaOpts
 
 from .convert import ModelConverter
 from .exceptions import IncorrectSchemaTypeError
@@ -119,7 +118,7 @@ class SQLAlchemySchemaMeta(SchemaMeta):
             cls_fields,
             # Filter out fields generated from foreign key columns
             # if include_fk is set to False in the options
-            mcs._maybe_filter_foreign_keys(inherited_fields, opts=opts, klass=klass),
+            mcs._maybe_filter_foreign_keys(inherited_fields, opts=opts),
             dict_cls,
         )
         fields.update(mcs.get_declared_sqla_fields(fields, converter, opts, dict_cls))
@@ -160,7 +159,6 @@ class SQLAlchemySchemaMeta(SchemaMeta):
         fields: list[tuple[str, Field]],
         *,
         opts: SQLAlchemySchemaOpts,
-        klass: SchemaMeta,
     ) -> list[tuple[str, Field]]:
         if opts.model is not None or opts.table is not None:
             if not hasattr(opts, "include_fk") or opts.include_fk is True:
@@ -170,39 +168,13 @@ class SQLAlchemySchemaMeta(SchemaMeta):
                 for column in sa.inspect(opts.model or opts.table).columns  # type: ignore[union-attr]
                 if column.foreign_keys
             }
-
-            non_auto_schema_bases = [
-                base
-                for base in inspect.getmro(klass)
-                if issubclass(base, Schema)
-                and not issubclass(base, SQLAlchemyAutoSchema)
-            ]
-
-            def is_declared_field(field: str) -> bool:
-                return any(
-                    field
-                    in [
-                        name
-                        for name, _ in _get_fields(
-                            getattr(base, "_declared_fields", base.__dict__)
-                        )
-                    ]
-                    for base in non_auto_schema_bases
-                )
-
-            return [
-                (name, field)
-                for name, field in fields
-                if name not in foreign_keys or is_declared_field(name)
-            ]
+            return [(name, field) for name, field in fields if name not in foreign_keys]
         return fields
 
 
 class SQLAlchemyAutoSchemaMeta(SQLAlchemySchemaMeta):
     @classmethod
-    def get_declared_sqla_fields(
-        cls, base_fields, converter: ModelConverter, opts, dict_cls
-    ):
+    def get_declared_sqla_fields(cls, base_fields, converter, opts, dict_cls):
         fields = dict_cls()
         if opts.table is not None:
             fields.update(

mistercrunch avatar Mar 19 '25 23:03 mistercrunch

ok, thanks for narrowing it down. will investigate further within the next few days

sloria avatar Mar 20 '25 03:03 sloria

sorry for the delay on this. haven't been able to dedicate time to marshmallow these past couple weeks. would definitely appreciate help on this, whether if it's an incomplete PR or an MRE

sloria avatar Mar 30 '25 15:03 sloria

On my side I likely won't be able to drive this. After running the diff above I have zero intuition as to what could have caused a memory-usage regression. I think I isolated the exact version, and from the diff there's nothing in there that looks like the culprit.

Maybe there's something I'm missing - I thought I'd see at least one thing that points in the direction of "oh this could be it", but can't seem to find it. Any idea what it could be?

On my side one thing I can do is double check that 1.4.1 is the actual issue (and that 1.4.0 is ok), but i think I did that already...

mistercrunch avatar Apr 01 '25 00:04 mistercrunch

thanks! and no worries if you can't drive this. even double-checking that 1.4.1 is the culprit version is a big help.

i suspect the this PR (the only one in 1.4.1) https://github.com/marshmallow-code/marshmallow-sqlalchemy/pull/657/files#diff-f6ccdfd29f2042e6b843cb0defd082f4a01abf90cdb3d18c9aa01ff7d57e8b19R174-R191 could be the issue because it creates intermediate lists upon class declaration. don't have time atm to do a code dive for a solution but it's a place to start

sloria avatar Apr 02 '25 14:04 sloria

Submitted a PR, a bit of a shot in the dark, but should improve perf a bit as it's preventing recomputing things in an inner loop. https://github.com/marshmallow-code/marshmallow-sqlalchemy/pull/667

mistercrunch avatar Apr 02 '25 15:04 mistercrunch

Thanks for the PR! I've released it in 1.4.2. @mistercrunch wanna give it a shot in the superset tests? 👀

sloria avatar Apr 09 '25 23:04 sloria

Oh 4.1.2 is out, let me try it 🤞 🤞

mistercrunch avatar Apr 10 '25 00:04 mistercrunch

Oh no! didn't fix the issue as seen here -> https://github.com/apache/superset/pull/33166

Logs -> https://github.com/apache/superset/actions/runs/14523061692/job/40750749107?pr=33166#step:5:126

Pretty sure that Error: Process completed with exit code 143. means GHA container ran out of memory and exited :(

For the record, we have thousands of runs succeeding on 4.1.0 and still pretty certain that something on >= 4.1.0 triggers the issue.

mistercrunch avatar Apr 17 '25 22:04 mistercrunch