python-tuf
python-tuf copied to clipboard
document dependabot PR review strategy
This is what I believe makes sense:
- "test" dependencies are mostly pinned to make sure we know which versions are being used and that the current version doesnt't surprisingly start breaking builds: reviewing the contents of test dependency updates is not required. If the tests pass that should be fine. Major updates could warrant a look in the changelog
- actual run time dependency updates should be reviewed: what this means is case dependent, but as a minimum we should check the changelog. Reading the commit log or actual changes may be useful but can also be an unrealistic goal for some dependency updates.
- The purpose of the dependency review is two-fold:
- prevent depending on software that works differently than we expect (so API changes, other functionality changes, bugs)
- prevent depending on software that is actually malicious (this is more relevant the newer the update is as a lot of malicious updates are noticed fairly quickly). It should be noted that pypi package can be malicious without the malicious code being in github: source code review only goes so far.
This isn't documented anywhere: it should be
I thought of touching that in the maintainer guidelines for go-tuf, so I'll be following 👍 One thing I had in mind is in go-tuf we have this policy where at least two people should review and approve PRs before they are merged, but there are dependabot cases where the change is nothing concerning and one approval should be enough too.
cc: @znewman01
This is what I believe makes sense: ...
Makes sense to me as well, and I have nothing really to add in terms of content.
I do have a few ideas for where we could write this down:
- in python-tuf's Governance doc (probably not the right place for wordy instructions, but rather a commitment to follow certain guidelines, that may be phrased out elsewhere)
- in a new Dependency Review Guidelines document, either in python-tuf's docs, or in secure-systems-lab/lab-guidelines, which already hosts the Development Guidelines, which are cited in python-tuf's Instructions for contributors
- as part of the Code Review Guidelines in that same repo, which is also referenced from the Development Guidelines
- ~in a PR template~ (unfortunately not yet supported for Dependabot PRs dependabot-core#2211)
One thing I had in mind is in go-tuf we have this policy where at least two people should review and approve PRs before they are merged, but there are dependabot cases where the change is nothing concerning and one approval should be enough too.
I'd argue that the potential benefits here aren't worth the cost. This complicates the policy for "is this PR merge-able" and there are mechanisms by which a single malicious maintainer could get an update into the dependency tree, then merge it themselves so it's less safe to have this policy. CC @mnm678 who expressed a similar concern
The rest of this thread makes sense to me, especially the "Code Review Guidelines" or new "Dependency Review Guidelines" solution, along with @jku's suggested process. It seems like we should make sure to reference the existing Development Guidelines from the go-tuf repo; I can do that when we update the contributor docs.
I analyzed our (Dependabot handled) dependencies and the results are actually a bit surprising -- this would look very different for an application but it looks like as library developers we don't really need to review that much!?
Because python-tuf is not an application, our python version pins are irrelevant to to our release product. The only Dependabot managed updates that directly affect release security are the GitHub actions (and the "build" python module)! This may be obvious but I had not fully connected the dots.
Some suggestions:
- We probably should pin "build" module as Tier 1: it is currently not pinned.
- We may want to unpin Tier 3: they are really just noise PRs
Dependabot review guidelines
This project uses Dependabot for a lot of things and this means Dependabot makes quite a lot of PRs. How should these PRs be reviewed?
Tier 1: Build, release and runtime dependencies
These dependencies directly affect the release artifacts we produce. Note that we currently have no Python dependencies managed by Dependabot in this group (because python-tuf is not an application). Changes should be reviewed as well as we can:
- (if dependency is python package) Check that the release exists in the upstream source repository
- Optionally check the source changes
- There is typically no rush to upgrade: Waiting a week or two is fine
Examples:
actions/checkout
actions/setup-python
actions/download-artifact
actions/upload-artifact
actions/github-script
softprops/action-gh-release
Tier 2: Pinned dev & test dependencies
Our end user (wheel) dependencies are not pinned but we do pin developer and test dependencies. The purpose of this pinning is to keep test environment stable.
There is no requirement to review these changes -- but a minimal review is polite to fellow developers who will install these pinned versions:
- Check that the release exists in the upstream source repository
Examples:
requirements-pinned.txt
requirements-test.txt
Tier 3: Other GitHub Actions
These dependencies are used by the project but cannot affect our release artifacts or our development environments: There seems to be no reason to review changes or even pin them.
actions/dependency-review-action
github/codeql-action/init
github/codeql-action/analyze
I think that's a good analysis, thanks!
These dependencies are used by the project but cannot affect our release artifacts or our development environments: There seems to be no reason to review changes or even pin them.
The one caveat is: might these dependencies try to exfiltrate credentials of some sort? I think if you don't hand them GH secrets this is probably okay, but it's worth making explicit somewhere.
The one caveat is: might these dependencies try to exfiltrate credentials of some sort? I think if you don't hand them GH secrets this is probably okay, but it's worth making explicit somewhere.
Absolutely: these need to be actions running with no access to secrets and preferably with a GITHUB_TOKEN that has no permissions.
Thanks for the detailed analysis! I agree that it makes sense to pin Tier 1 and unpin Tier 3, and I mostly agree with your review suggestions.
I still think that we should try to review Tier 2 releases. We don't pin dependencies for our users yet, but if a release of a dependency breaks our CI, we would likely add a version constraint. And we should do the same, if we find any problematic code, even if CI passes.
On a side-note, our Tier 2 dependencies usually are easier to review than the GitHub Actions in Tier 1, which often have very large diffs, sometimes with (obfuscated) vendored dependencies.
I still think that we should try to review Tier 2 releases. We don't pin dependencies for our users yet, but if a release of a dependency breaks our CI, we would likely add a version constraint. And we should do the same, if we find any problematic code, even if CI passes.
Oh yeah, I should have added that any pinned versions in pyproject.toml are automatically Tier 1, it's just the requirements files that are not "production critical"
This is brilliant work, thanks for taking the time to think this through and document it. I completely agree with your statements with the additional constraint in https://github.com/theupdateframework/python-tuf/issues/2014#issuecomment-1335276083 that "any pinned versions in pyproject.toml are automatically Tier 1".
Because python-tuf is not an application, our python version pins are irrelevant to to our relase product. The only Dependabot managed updates that directly affect release security are the GitHub actions! This may be obvious but I had not fully connected the dots.
I think many of us had not connected these dots, at least in part because legacy python-tuf really did look like an application.
Just came across this data point about pinning dev dependencies (from a blog post by @hynek):
[...] it’s more practical to fix breaking CI as it happens (which is rare) instead of becoming one of the projects whose majority of commits are dependency updates.
I agree that churn is something to consider here.
This is actually more relevant and actionable now with Dependabot groups: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#groups -- it allows us to document the policy AND make the computer do most of the work.
We could define three groups (tiers) of dependencies as described in this issue. Each group then gets a "combined" dependabot PR at selected intervals. This way there's less PRs, and the important updates are easier to notice (because they're in the Tier 1 update).
I'll try this in the next weeks if no one beats me to it.
[...] it’s more practical to fix breaking CI as it happens (which is rare) instead of becoming one of the projects whose majority of commits are dependency updates.
I don't see the number of commits as an issue but maintainer time absolutely is one... There is a difference in the maintainer time spent on dependabot PRs and maintainer time spent on fixing CI failures though: the former I can do when I feel like it, the latter always happens at the worst time possible -- this is why I personally like tight dependency pinning for most things