Introduce a gate/check GHA job
This adds a GHA job that reliably determines if all the required dependencies have succeeded or not.
It also allows reducing the list of required branch-protection CI statuses to just one — check. This reduces the maintenance burden considerably and has been battle-tested across a number of projects, both in its action form and in other people's in-house implementations.
This action is now in use in aiohttp (and other aio-libs projects), CherryPy, some of the Ansible repositories, pip-tools, all of jaraco's projects (like setuptools and importlib_metadata), some of hynek's projects (like attrs and structlog), some PyCQA, PyCA, PyPA and pytest projects, and a few AWS Labs projects. Admittedly, I maintain a few of these, but it seems to address some of the pain folks have: https://github.com/jaraco/skeleton/pull/55#issuecomment-1106638475.
I saw a few attempts to reduce the maintenance burden for GHA and figured this might be useful for CPython too, which is why I'm submitting this patch. Looking forward to hearing what you think!
The story behind this is explained in more detail at https://github.com/marketplace/actions/alls-green#why.
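The gate-job pattern described above can be sketched as follows. The job IDs listed under `needs:` (`build`, `docs`) are hypothetical placeholders; the `alls-green` usage follows that action's published README:

```yaml
# Hypothetical gate job; `build` and `docs` are placeholder job IDs.
check:
  name: check  # the only status branch protection would need to require
  if: always()  # run even when the needed jobs fail or get skipped

  needs:
    - build
    - docs

  runs-on: ubuntu-latest

  steps:
    - name: Decide whether the needed jobs succeeded or failed
      uses: re-actors/alls-green@release/v1
      with:
        jobs: ${{ toJSON(needs) }}
```

Because the gate evaluates the real outcomes of its dependencies (including skips and cancellations), branch protection only ever needs to know about this one status.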
Thanks for the PR!
Some points to consider:
- This changes the security levels: a small group of 6 org owners + 8 repo admins can change the required checks via repo settings. With this PR, any of the ~90 active core devs can merge changes to this file. Is that okay? I think it's probably fine; we're not deploying anything from here, but let's make sure.
- Whilst this does reduce the maintenance burden of administering which checks are required, is that something that needs changing often? For projects that test a big matrix of Python versions, it's definitely a plus: no need to add/remove version-specific checks via the interface once or twice a year. Maybe less so here?
- Has this ~14 vs. ~90 been a bottleneck in the past?
@hugovk these are good points and I haven't considered some of them.
- I should probably document the security considerations in the README. It seems like this might be solvable by requiring reviews from code owners, for example. That would limit the scope of CI updates flying under the radar. It also occurred to me that repository settings changes happen unnoticed, with no traceability for the larger public — only the admins can see them in the security audit log, and only if they specifically search for such changes. Whereas with branch-protection-as-code, it's all there, verifiable via https://github.com/python/cpython/commits/main/.github/workflows/build.yml.
- I've been offering similar PRs to many projects with matrices of various sizes, and I've seen maintainers forget to add new matrix factors. After all, it's mental overhead having to remember to perform extra action items far from the place where you're editing the source code, especially since they aren't even in the code.
- That's not something I can observe/verify :man_shrugging: — maybe @ezio-melotti knows?
I don't know if this is worth it for CPython as a number of jobs we require are outside of GitHub Actions (CLA, Issue Number, and News). So this check will increase complexity but won't actually solve the problem of bringing all required checks in one place.
@hugovk , as you've worked a lot of CI stuff lately across the various Python-org repos, do you have further feedback on this?
- I should probably document the security considerations in README. It seems like this might be solvable by requiring reviews from code owners, for example.
Just to note, that would be a pretty huge change in terms of workflow for this repo, since CODEOWNERS is used here for much more than just who is required to review certain changes. I'm not sure that itself is a dealbreaker for this proposal, but certainly something to keep in mind.
Agreed with @ambv: this looks like significant added complexity for not much actual gain in this project. It looks like a nice idea for some other setups, but doesn't feel like a great fit for us here. It's also possible I'm just missing the point, though :)
- I should probably document the security considerations in README. It seems like this might be solvable by requiring reviews from code owners, for example.
Just to note, that would be a pretty huge change in terms of workflow for this repo, since CODEOWNERS is used here for much more than just who is required to review certain changes.
FYI the SC has explicitly stated individuals cannot act as owners on any one piece of code (i.e. we all own the code), so requiring CODEOWNERS approval is a non-starter IMO.
Alright, so I'm closing this based on the latest feedback. Thanks for putting this together, @webknjaz, and sorry it wasn't a good match for us at this time!
Given that our requirements changed, with Auto-merge enabled, and more checks existing now, I would like to reconsider this PR for inclusion.
@webknjaz, would you be so kind as to adapt this to the current state of main?
@ambv so I just realized that the docs are its own workflow. The only way this is gonna work is wiring that in as a reusable workflow.
A quick summary of our in-person interaction. Łukasz and I agreed to implement the following:
- rename the check job ID and name
- convert the docs workflow into a reusable one
- integrate that into the main build workflow, making use of the dynamic change detection that's already present there
- move the "allowed failures" declarations to the check job
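The integration agreed above could look roughly like the sketch below. The job ID `check_source`, its `run_docs` output, and the reusable workflow path are made-up illustrations of the pattern, not the actual CPython job names:

```yaml
# Sketch: calling the docs workflow conditionally from the main
# build workflow, driven by a change-detection job.
jobs:
  check_source:
    runs-on: ubuntu-latest
    outputs:
      run_docs: ${{ steps.detect.outputs.run_docs }}
    steps:
      - id: detect
        # Real change detection would inspect the diff; hardcoded here.
        run: echo "run_docs=true" >> "$GITHUB_OUTPUT"

  check-docs:
    needs: check_source
    # Only invoke the docs workflow when docs-related files changed.
    if: fromJSON(needs.check_source.outputs.run_docs)
    uses: ./.github/workflows/reusable-docs.yml
```

The gate job can then list `check-docs` in its `needs:` like any other in-workflow job.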
@ambv do we need to do the same to the build_msi workflow?
@webknjaz would you be able to explain a little about 'reusable workflow' for the documentation workflow? I don't know what makes it a special case, or what a reusable workflow is!
explain a little about 'reusable workflow' for the documentation workflow? I don't know what makes it a special case, or what a reusable workflow is!
TL;DR it's a way to include one workflow into another: https://docs.github.com/en/actions/using-workflows/reusing-workflows.
Łukasz and I talked at the PyCon Sprints and I explained in person how it's useful for this case. Basically, you cannot depend on jobs running in separate workflows, so this patch makes it so the docs workflow is included inside the main one and doesn't get executed as a standalone thing. It is a nice way of organizing bits of automation without having to shove everything in the same file. Reusable workflows can also exist in external repositories, but that's not our case. I did show other workarounds for making auto-merge work, but this one is the most elegant solution.
Please, check out this graph to see what this looks like: https://github.com/python/cpython/actions/runs/4814439259.
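To make the mechanics concrete: a reusable workflow declares a workflow_call trigger, and the caller includes it with a one-line uses: entry. A minimal sketch (the file names and job IDs here are illustrative, not the actual CPython ones):

```yaml
# .github/workflows/reusable-docs.yml — the callee
name: Docs
on:
  workflow_call:  # makes this workflow includable from another workflow

jobs:
  build-doc:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # ... build the documentation here ...
```

```yaml
# .github/workflows/build.yml — the caller
jobs:
  check-docs:
    name: Docs
    uses: ./.github/workflows/reusable-docs.yml  # a one-line inclusion
```

The callee's jobs then show up nested under the caller's run, which is what makes them addressable via `needs:` from other jobs in the main workflow.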
@ambv do we need to do the same to the build_msi workflow?
And a follow-up question: should I submit the first commit (the reusable workflow intro) as a separate PR so that it's more or less atomic?
@AA-Turner oh, and in addition to the above, I suggested that Łukasz not add the new job to the branch protection right away, but rather observe and allow for a grace period before going all-in. That way it won't introduce any friction or side effects for people.
To confirm my understanding: the 'all-required-green' job works iff all required jobs are within a single workflow.
I suppose my challenge would be extensibility for the future, as this might make it awkward to add more jobs. Say in a hypothetical world we want to require that the verify wheels action passes on every PR -- if I'm correct, this would require re-writing the action and including it within the build.yml file. Could you confirm my understanding?
Is there a way to keep the benefits of the 'configuration as code' approach that this takes whilst still maintaining distinct workflows (for extensibility purposes), perhaps via uploading a dummy artefact or etc to a well-known location on GHA or some other way of sharing state?
Not sure about build_msi; I would leave it be for now.
You could make the reusable intro a separate PR to make reviewing this one easier.
Hey @AA-Turner, thanks for the questions! I hope my answers below will be able to shed some light:
To confirm my understanding: the 'all-required-green' job works iff all required jobs are within a single workflow.
Yes. Alternatively, there could be many workflows with disconnected matrices, and each would need such a job (though with different names!) — each of them would then need to be added to the branch protection.
This works by adding jobs as dependencies via needs:, which is not possible across workflows.
I suppose my challenge would be extensibility for the future, as this might make it awkward to add more jobs. Say in a hypothetical world we want to require that the verify wheels action passes on every PR -- if I'm correct this would require re-writing the action and including it within the build.yml file. Could you confirm my understanding?
Kinda. I'd like to clarify that it's nicer to keep things in different YAML files. And the inclusion would just be a one-liner with a uses: entry, not physically copying the contents over. The result is that the definition lives in its own dedicated YAML file while being part of the common workflow at runtime (test time?).
A unified workflow also allows running various change detection techniques once and calling some workflows conditionally.
I would even argue that most of the jobs currently declared in the build.yml workflow file should move out at some point, for better separation of the different workflow layers. Though, this is definitely a separate discussion, as I want to keep this PR atomic and focused on a single thing, without scope creep.
Is there a way to keep the benefits of the 'configuration as code' approach that this takes whilst still maintaining distinct workflows (for extensibility purposes), perhaps via uploading a dummy artefact or etc to a well-known location on GHA or some other way of sharing state?
I'm not sure what you're asking here. But as mentioned above, I'd very much recommend splitting things out and keeping the high-level inclusion and conditionals in the main workflow. If you're asking about sharing artifacts across jobs in general, that is certainly possible with the “stdlib” actions/upload-artifact and actions/download-artifact. The workflows could refer to the same artifact names. Also, reusable workflows can accept inputs, just like actions, so the caller could pass some “arguments” into the called workflows (think of it like calling functions, in a way).
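To illustrate the "arguments" point, a reusable workflow can declare inputs that the caller passes explicitly. The input name `changed-files` and the job names below are made up for the example:

```yaml
# Callee: declares an input, much like a function parameter.
on:
  workflow_call:
    inputs:
      changed-files:
        required: false
        type: string

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Changed files: ${{ inputs.changed-files }}"
```

```yaml
# Caller: passes the "argument" when including the workflow.
jobs:
  check-docs:
    needs: check-for-changes  # hypothetical change-detection job
    uses: ./.github/workflows/reusable-docs.yml
    with:
      changed-files: ${{ needs.check-for-changes.outputs.files }}
```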
@ambv I'll submit the PR shortly. In the meanwhile, I noticed a bug in the docs workflow that has existed there for a while and is being ignored almost silently — it's here: https://github.com/python/cpython/actions/runs/4814439259/jobs/8572091258#step:9:3. There are two problems with it: touch does not get any arguments, so it produces an error; and a multiline shell script executed without set -eEuo pipefail keeps running after one of the commands in the middle fails, so the final outcome is the result of the very last instruction in that script.
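That failure mode is easy to demonstrate: without strict mode, a step's outcome is the exit status of its last command, so a mid-script failure gets swallowed. A minimal illustration:

```shell
#!/usr/bin/env bash
# Without strict mode, the failing `false` in the middle is ignored:
# the subshell keeps going and exits with the last command's status.
bash -c 'false; echo "still ran"' && echo "exit was 0 despite the failure"

# With `set -eEuo pipefail`, the subshell aborts at the first failure
# and propagates a non-zero exit status.
bash -c 'set -eEuo pipefail; false; echo "never printed"' \
  || echo "exit was non-zero as intended"
```

GitHub Actions does inject some default shell options for `run:` steps, but not the full strict set, which is why multi-command steps are easy to get subtly wrong.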
The separate PR is here: https://github.com/python/cpython/pull/103914.
Say in a hypothetical world we want to require that the verify wheels action passes on every PR -- if I'm correct this would require re-writing the action and including it within the build.yml file.
I realized that I didn't fully respond to this since I didn't understand what you implied. I suppose you were also curious about conditional runs/change detection. That would indeed move to the main workflow, though the change detection itself can be separated out too, if needed, either as a reusable workflow or as a composite action. We can solve this elegantly when the time comes.
@ambv I'll submit the PR shortly. In the meanwhile, I noticed a bug in the docs workflow that has existed there for a while and is being ignored almost silently — it's here python/cpython/actions/runs/4814439259/jobs/8572091258#step:9:3. There are two problems with that — touch does not get any args so it produces an error output. And a multiline shell script executed without set -eEuo pipefail keeps running after one of the commands in the middle fails, and the final outcome is the result of the very last instruction in that script.
Hmm, in that case there are no modified docs in the PR, so nothing to touch and rebuild. It could be nicer and skip when there are no files.
But more importantly, since this PR was opened we've refactored it to touch via a Python script instead of the touch command, so the behaviour may have changed too. Something to check in the other PR.
Hmm, in that case there are no modified docs in the PR, so nothing to touch and rebuild. It could be nicer and skip when there are no files.
Well, the main thing I wanted to point out was that it's best to avoid multi-command shell script steps; otherwise you have to remember to include the shell mode configuration that helps avoid introducing unnoticeable shell misbehaviours. The latter implies extra cognitive load, though, so I prefer the former.
But more importantly, since this PR was opened we've refactored it to touch via a Python script instead of the touch command, so the behaviour may have changed too. Something to check in the other PR.
If that's the case, it's a good thing.
@ambv @hugovk I think this is ready.
@ambv @hugovk let's see if this is ready now...
@hugovk IIRC Łukasz and I talked about merging this PR but not changing the branch protection settings for a while, so the check name would start being reported in the Checks widget as old PRs get updated; otherwise they'd get blocked until updated.
Thanks again!
GH-107114 is a backport of this pull request to the 3.12 branch.
GH-107115 is a backport of this pull request to the 3.11 branch.
UPD: this has been backported to 3.12 and 3.11 and the branch protection changed accordingly.