chore: [proposal] de-matrix python-version in GHAs
SUMMARY
In this PR:
- Simplifying the single-item matrix (`python-version`) to NOT use a matrix at all. I'm guessing the single-item matrix is an artifact of supporting multiple versions in the past, and/or of making it easy to add multi-python-version checks in the future, but there's a burden associated with it, especially around how it relates to the "required checks" specified in `.asf.yml`.
- Leveraging `setup-backend`'s default python version, making the main python version we use much more DRY.
- Fixing/simplifying the related no-op workflows. We'll need new ones, but will be able to deprecate a bunch and simplify things. For instance, when we migrate to 3.11 in the future, we won't have to manage a bunch of python-version-specific no-ops.
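To make the shape of the change concrete, here's a rough before/after sketch of a workflow job. Job names, the action path, and the default version are illustrative assumptions, not the exact files in the repo:

```yaml
# BEFORE (sketch): a single-item matrix. The version leaks into the
# check name ("unit-tests (3.10)"), which the "required checks" list
# in .asf.yml then has to match exactly.
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-backend
        with:
          python-version: ${{ matrix.python-version }}
```

```yaml
# AFTER (sketch): no matrix. The composite action's own default
# python version is the single, DRY source of truth, and the check
# name ("unit-tests") stays stable across version bumps.
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-backend
```

The point of dropping the matrix is that bumping the python version later becomes a one-line change in the composite action, with no renamed checks and no no-op workflows to maintain.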
About supporting multiple/future versions of python, I'd argue that we should focus on a single one for a given CI run, and that if/when we need to CI against multiple versions, we run a FULL test suite punctually in a dedicated PR/branch/ref. Point being, it's expensive for every commit to validate multiple versions of python, and in many ways it's not necessary.
Currently our multi-python-version support is dubious at best, with only a few checks that run against multiple versions. I really think we should pick a single version and support it very well. If/when we want to upgrade the python version, we'd cut a PR and run CI for that purpose.
If we want to continuously, actively support multiple python versions (and I don't think we should!), I'd suggest either a release-specific procedure (the release manager using the release branch, running full CI for that version/release) and/or a nightly job that would keep an eye on that version of python.
I'm all for just "officially" supporting a single PY version going forward. Most other projects do this AFAIK, so what's stopping us? :)
The only benefit of supporting multiple python versions is to increase the chance of offering more compatible client/driver libraries for source data to end users.
For example: I use database XXX, its python client is not compatible with python 3.11, and I want to use the latest Apache Superset version X.Y.Z, which only supports 3.11.
A policy like supporting the last 2 versions of python with active support could be a first step, wdyt?
Right, though we need to define what "support" means. Some criteria:
- providing an official Docker release for that version of python, probably running the CI/test suite at release time
- running all CI unit/integration tests on that version of python at all times, ensuring that, say, `master` is compatible with that version of python at most times
- not doing proactive tests, but also not restricting the python package, letting people "run at their own risk" (this last one seems never desirable, and is kind of what we're doing currently)
What we're doing now:
- the python package lets users install on 3.9, 3.10 and 3.11 I believe, based on the pyproject.toml
- we run the full test suite on 3.10
- we run one or two test suites on 3.11
- official Docker images are all 3.10
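The package-level constraint lives in `pyproject.toml`. As a sketch of what aligning it with an "officially supported version" policy could look like (the exact bounds here are assumptions based on the list above, not Superset's actual metadata):

```toml
[project]
name = "apache-superset"
# Today (roughly): anyone on 3.9-3.11 can pip-install,
# even though only 3.10 gets the full test suite.
requires-python = ">=3.9, <3.12"

# Under a single-officially-supported-version policy, this could
# be narrowed to match what CI actually validates, e.g.:
# requires-python = ">=3.10, <3.11"
```

Narrowing `requires-python` is the opposite of the "run at their own risk" option above: pip would refuse to install on versions CI doesn't cover.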
Capturing the conversation from the Release Strategy Group this morning:
- let's support a single python version officially per release, and run CI against that one mainly
- let's keep preventing "future regressions" on the next python version by running some minimal CI against it (currently test-postgres against 3.11). For now this is the only place where we need a proper matrix. I might try renaming the entries to "current" and "future" to address the required-check issues
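One way to sketch that named matrix so the check names stay stable ("test-postgres (current)", "test-postgres (future)") even when the underlying versions are bumped. The version values and action path are assumptions for illustration:

```yaml
jobs:
  test-postgres:
    strategy:
      matrix:
        include:
          - name: current   # the officially supported version
            python-version: "3.10"
          - name: future    # early warning for the next upgrade
            python-version: "3.11"
    # Naming the job after matrix.name keeps .asf.yml's required
    # checks valid across version bumps.
    name: test-postgres (${{ matrix.name }})
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-backend
        with:
          python-version: ${{ matrix.python-version }}
```

With this shape, `.asf.yml` can require "test-postgres (current)" permanently, and a python upgrade only swaps the version strings inside the matrix.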