Make Superset compatible with `marshmallow>=4.0.0`
Bug description
On a fresh install of Superset, since marshmallow 4.0.0 was released (https://pypi.org/project/marshmallow/), Superset fails to initialize due to an error validating parameters
It is possible to work around this by manually installing marshmallow at 3.26.1
pip3 install marshmallow==3.26.1
It looks like the version is being set in requirements/development.txt, so I'm unsure as to where the version update is occurring
Screenshots/recordings
No response
Superset version
master / latest-dev
Python version
3.9
Node version
16
Browser
Not applicable
Additional context
  File "/home/ec2-user/.local/lib/python3.9/site-packages/superset/charts/schemas.py", line 621, in ChartDataProphetOptionsSchema
    periods = fields.Integer(
  File "/home/ec2-user/.local/lib/python3.9/site-packages/marshmallow/fields.py", line 977, in __init__
    super().__init__(as_string=as_string, **kwargs)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/marshmallow/fields.py", line 924, in __init__
    super().__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'min'
Checklist
- [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
Hello @MattBodey! I'm Dosu, a bot here to help you with bugs, answer your questions, and guide you on your journey to becoming a contributor. I'm ready to assist while you wait for a human maintainer. Let’s tackle this together!
The issue you're encountering is due to a compatibility problem between Superset and Marshmallow 4.0.0. Specifically, the fields.Integer class in Marshmallow 4.0.0 no longer supports the min keyword argument, which is causing the error during Superset's initialization.
To resolve this issue, you can downgrade Marshmallow to version 3.26.1, where the min keyword argument is still supported. You can do this by running the following command:
pip3 install marshmallow==3.26.1
This should allow Superset to initialize without errors. It seems that the version of Marshmallow is specified in requirements/development.txt, but there might be an override or update happening elsewhere in the setup process. You may want to check other configuration files or scripts that could be specifying a different version of Marshmallow.
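As a rough illustration of the adaptation a schema would need: marshmallow 3 accepted extra keyword arguments on a field and treated them as field metadata, while marshmallow 4 rejects them, which matches the TypeError above. The sketch below separates such extras into an explicit `metadata` dict. The helper name `split_field_kwargs`, the `KNOWN_FIELD_ARGS` set, and the sample description string are hypothetical and deliberately incomplete; none of them come from marshmallow or Superset.

```python
# Hypothetical, incomplete set of keyword arguments that a marshmallow
# field constructor accepts directly (an assumption for this sketch).
KNOWN_FIELD_ARGS = {
    "required", "allow_none", "load_default", "dump_default",
    "data_key", "validate", "load_only", "dump_only", "metadata",
}


def split_field_kwargs(kwargs):
    """Return a copy of kwargs with unknown keys moved under ``metadata``."""
    field_kwargs = {}
    # Start from any metadata the caller already passed explicitly.
    metadata = dict(kwargs.get("metadata", {}))
    for key, value in kwargs.items():
        if key == "metadata":
            continue
        if key in KNOWN_FIELD_ARGS:
            field_kwargs[key] = value
        else:
            # Extras such as ``min`` or ``minLength`` become metadata entries.
            metadata[key] = value
    if metadata:
        field_kwargs["metadata"] = metadata
    return field_kwargs


# Illustrative legacy-style arguments (the description text is made up).
legacy = {"required": True, "min": 1, "description": "Time periods to forecast"}
migrated = split_field_kwargs(legacy)
print(migrated)
```

With marshmallow installed, the result would be passed as `fields.Integer(**migrated)`. If `min` is meant to be enforced rather than merely documented, a validator such as `validate.Range(min=1)` is the usual route.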
Is there somewhere we need to pin the dependency? It looks like Superset already installs 3.26.1. I'm not sure about the details of the failure, but I know we also use marshmallow-sqlalchemy which doesn't seem to support marshmallow 4.0 yet if I'm understanding correctly.
I followed the PyPI install instructions, and running pip install marshmallow==3.26.1 just before superset db upgrade did the trick.
Hi there,
we are running a little integration test suite against Apache Superset and CrateDB, in order to validate both work together well. Two days ago, CI apparently started tripping on it.
TypeError: Field.__init__() got an unexpected keyword argument 'minLength'
Do you think we need to adjust a few details, or is it just about marshmallow-4.0.0, which was released two days ago, as suggested above?
With kind regards, Andreas.
References
- https://github.com/crate/cratedb-examples/tree/main/application/apache-superset
- https://github.com/crate/cratedb-examples/actions/workflows/application-apache-superset.yml
- https://github.com/crate/cratedb-examples/actions/runs/14545171337/job/40820574564#step:7:352
@amotl I guess either
- Superset stops using minLength and generally adapts to Marshmallow v4
- or Superset ensures Marshmallow v3.26.1 is pinned all the way
- or Marshmallow re-adds minLength and others
In the meantime we must force Marshmallow version. How? Depends on how you install superset.
Hi. Thank you very much.
> In the meantime we must force Marshmallow version. How? Depends on how you install superset.
I don't completely understand this, yet. Isn't it possible to just add a corresponding version pinning constraint to Superset's main dependencies to handle the situation well, until Superset will be compatible with Marshmallow 4?
Marshmallow is already pinned to the right version in development.txt, via marshmallow-sqlalchemy module requirements, but somehow v4.0.0 gets installed in the end.
Depends on how you install Superset.
Whether you install it via pip install superset, via Kubernetes, or by downloading the Docker image, you won't force the Marshmallow version the same way. I'm not fluent in Docker, even less in Kubernetes, so I went the PyPI way to fix it.
Hi Colin, thanks for showing the right dependency pinning in development.txt. The same, even more important, should also be conducted for the runtime dependencies, right? Otherwise, it will leave the average user stranded like outlined in this thread.
It looks like the dependency of Marshmallow is also pinned per base.txt ^1. Do you have an idea why it isn't used at runtime?
Hi again. We submitted a patch to the 4.1 branch, in order to improve the situation for Superset 4.
- GH-33216
Hi is this fixed? Or can I be assigned this one?
Hi. I think it's "in progress". I've just refreshed the patch GH-33216.
Hi @mistercrunch. It's just a humble question about your plans: Is there a bugfix release for Apache Superset 4 scheduled, which would include the fix GH-33216? Or are other patches pending that want to be bundled into the same release?
Oh gotcha, I didn't look at the internal logic around dependency management, but normally we recommend installing the pinned dependencies (as we do in our Dockerfile). The pinned deps are the .txt here https://github.com/apache/superset/tree/master/requirements .
From my understanding we never pointed to marshmallow==4.0.0 in those files in master or any releases, since it would have failed our CI. You may want to alter your build process to install those pinned dependencies to align with the libs we run our CI against and ship in our Docker image.
Normally, we would have caught this issue and produced a similar fix when our dependabot equivalent (supersetbot) tried to upgrade marshmallow. But glad you caught and fixed it!
Wondering if we can close this issue, though we need to eventually upgrade to supporting marshmallow and remove the >=4.0 from that .in file
Any further thoughts (from anyone) on closing this (or not)?
It's not an issue as much as a TODO, it files under normal dependency management. Maybe we open a backend-oriented starter task "make superset support marshmallow>=4.0.0". Could recycle this issue for this purpose.
Couldn't find the "starter-task" label, what is it called nowadays?
@mistercrunch There is a "good first issue" label 😉
@turingnixstyx it's totally up for grabs
Unless anyone here wants to tackle this (feel free!), we'll likely move it to an "Ideas" thread on GitHub Discussions.
Hi. Apache Superset is still failing on our CI, until this will get resolved positively. It is really just about running a maintenance release of Superset 4, now that GH-33216 got merged already, no?
https://github.com/crate/cratedb-examples/actions/workflows/application-apache-superset.yml
Now that this will be addressed with https://github.com/apache/superset/pull/33216 (setting the library version range boundary), this issue is about making Superset compatible with marshmallow>=4.0.0 so that we can remove the boundary and support future versions of marshmallow. This isn't an urgent matter, but fits with keeping our dependency tree up-to-date. Not sure if "Ideas" is the right place, but I added the good-first-issue label.
@mistercrunch: I see, thank you very much. Can the community still expect a 4.1.3 release, which will resolve the problem in the interim, until other patches will bring in compatibility with marshmallow>=4?
@sadpandajoe and @michael-s-molina have been closer to releases, but I also want to make sure no one gets blocked or depends on a release to deploy Superset. Depending on how you do your own CI and deployments, there should be many ways for you to deploy the previous release. About marshmallow specifically, it should be as simple as "pip installing" marshmallow==3.26.1 prior to pip installing the official apache-superset package. Or better, pip install -r requirements/base.txt to ensure that you install all of the repo's pinned dependencies that are the ones we use for all of our tests / CI workloads. Or even better, use uv pip install as a faster alternative
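For anyone applying the workaround above in CI, a small pre-flight check can confirm which marshmallow version actually ended up installed before running Superset. This is a minimal sketch using only the standard library; the function names and the `MAX_SUPPORTED_MAJOR` constant are hypothetical, and the version parsing is simplified (it only reads the leading major number).

```python
from importlib import metadata
from typing import Optional

# Assumption taken from this thread: 3.26.1 works, 4.0.0 does not.
MAX_SUPPORTED_MAJOR = 3


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.26.1'."""
    return int(version.split(".")[0])


def is_supported(version: str, max_major: int = MAX_SUPPORTED_MAJOR) -> bool:
    """True when the major version does not exceed the supported ceiling."""
    return major_version(version) <= max_major


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check(package: str = "marshmallow") -> str:
    """Describe whether the installed package falls within the supported range."""
    version = installed_version(package)
    if version is None:
        return f"{package} is not installed"
    if is_supported(version):
        return f"{package} {version} is within the supported range"
    return f"{package} {version} is too new; install {package}<{MAX_SUPPORTED_MAJOR + 1}"


if __name__ == "__main__":
    print(check())
```

Running this right after the install step and before superset db upgrade would surface the mismatch early instead of as a TypeError at import time.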
Hi @mistercrunch. Thanks for outlining a workaround. However, it would be so sweet if we could keep the canonical incantation uv pip install apache-superset==4.* on the CI workflow like it is implemented now.
It is actually meant to be an integration test canary for both us (downstream towards CrateDB Nightly) and you (upstream towards stable releases of Apache Superset). That's why I am commenting here to relay that both Apache Superset 3 and 4, when installed from PyPI without much ado, have been broken since April 17 due to misaligned runtime dependencies. In our humble opinion, this should be fixed.
> Better use pip install -r requirements/base.txt?
Thanks. Unfortunately, this is not an option, because we are trying to follow the idea that users solely use uv pip install as an interface to install packages, without obligatory access to the source repository. [^1]
[^1]: Imagine users consuming the package from a PyPI mirror who do not have any chance to access GitHub at all, due to connectivity restrictions in certain countries.
Right. We should really make sure the package is "self-standing" as much as possible, with good "library-range-supported" semantics.
In theory we could put a ceiling based on semver, meaning we would always assume that a new major version of any package could break things, and even if, say, marshmallow 4.x isn't released yet, we would assume it could break things and always put a ceiling on the next major across ALL packages.
Now in practice doing this prevents dependabot/supersetbot from opening PRs trying to bump libraries. Maybe we'd need these integrations to alter or look beyond those ceilings.
While this may help the viability of the main package, the reality is that we can only afford to run CI on a single set of pinned deps, as the matrix of testing various versions of libraries - especially the matrix of combinations - won't be possible. Now wondering if there's a way to add "preferred version of libraries" in pyproject.toml without pinning things... Answer is no -> https://chatgpt.com/share/68505b89-2e1c-8010-b998-596269f508e1
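To make the semver-ceiling idea concrete, here is a minimal sketch that turns a pinned version into a range capped below the next major release. The function names are hypothetical, the parsing only reads the leading major number, and this is not how Superset's requirement files are actually generated.

```python
def next_major_ceiling(package: str, pinned_version: str) -> str:
    """Build a requirement string capped below the next major version."""
    major = int(pinned_version.split(".")[0])
    return f"{package}>={pinned_version},<{major + 1}"


def cap_all(pins: dict) -> list:
    """Apply the next-major ceiling to every pinned package, sorted by name."""
    return [next_major_ceiling(name, version) for name, version in sorted(pins.items())]


# The marshmallow pin is the one mentioned in this thread.
print(next_major_ceiling("marshmallow", "3.26.1"))
```

For the pin discussed in this thread, the resulting specifier is `marshmallow>=3.26.1,<4`, which keeps a plain install of the package from picking up 4.0.0.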
Hi @mistercrunch,
in this particular case we are really just looking at getting the marshmallow dependency fixed by running a regular release, but we can extend the topic into a general discussion, sure.
> In theory we could put a ceiling based on semver, meaning we would always assume that a new major version of any package could break things, and even if, say, marshmallow 4.x isn't released yet, we would assume it could break things and always put a ceiling on the next major across ALL packages.
Yeah, exactly. This is kind of common practice, and makes very much sense?
> Now in practice doing this prevents dependabot/supersetbot from opening PRs trying to bump libraries. Maybe we'd need for these integration to alter or look beyond those ceilings.
Can you elaborate how this breaks your workflow? It works perfectly well for us.
With kind regards, Andreas.
> Can you elaborate how this breaks your workflow?
Oh simply because of the way those bots are configured / designed: they operate within the package-defined boundaries. For @supersetbot we control the code that auto-submits PRs to bump libs, so we could do whatever, but for @dependabot I doubt it's configurable to that extent (it offers few configuration options, or at least not the ones we need, which explains the need for @supersetbot in the first place ...)