SBOM for GitHub Actions workflow files
What would you like to be added:
I would like to be able to track my GitHub Actions workflows as an SBOM, such that if I have an action with uses: actions/[email protected], I get out an SBOM that contains this action, and which actions in turn it contains.
AFAIK, there is no standard today for how to express GitHub Actions as an SBOM, but if we resolve each tag to a commit, and then recursively follow each workflow, it should be Doable™.
Why is this needed: It will let users of full semver tags (v1.2.3 as opposed to v1) detect whether a tag has moved, since they can diff the generated SBOM against the previous one. It will also let GHA users track which direct and transitive workflows they're pulling in via their GHAs.
Additional context:
Thanks for the issue @audunmo!
We do currently do some of this work with syft:
syft dir:.github/
 ✔ Indexed file system .github
 ✔ Cataloged contents bdcc6a2a85f645f62724fe8dafbf0581cb0c1d65f6a76cb2985a9172e31a473c
   ├── ✔ Packages [23 packages]
   ├── ✔ Executables [0 executables]
   ├── ✔ File digests [9 files]
   └── ✔ File metadata [9 locations]
[0000] WARN no explicit name and version provided for directory source, deriving artifact ID from the give
NAME VERSION TYPE
./.github/actions/bootstrap UNKNOWN github-action (+2 duplicates)
actions/cache 0c907a75c2c80ebcb7f088228285e798b750cf8f github-action
actions/checkout v4.1.1 github-action (+2 duplicates)
actions/checkout v4.2.1 github-action
actions/checkout v4.2.2 github-action
actions/setup-go 93397bea11091df50f3d7e59dc26a7711a8bcfbe github-action
actions/setup-go v5.0.0 github-action
anchore/workflows/.github/actions/update-go-dependencies main github-action
anchore/workflows/.github/workflows/dependabot-automation.yaml main github-action-workflow
anchore/workflows/.github/workflows/oss-project-board-add.yaml main github-action-workflow
anchore/workflows/.github/workflows/release-install-script.yaml main github-action-workflow
anchore/workflows/.github/workflows/remove-awaiting-response-label.yaml main github-action-workflow
fountainhead/action-wait-for-check v1.2.0 github-action
github/codeql-action/analyze v3.23.2 github-action
github/codeql-action/autobuild v3.23.2 github-action
github/codeql-action/init v3.23.2 github-action
peter-evans/create-pull-request v7.0.5 github-action
tibdex/github-app-token v2.1.0 github-action
zizmorcore/zizmor-action v0.1.1 github-action
If an advisory were filed against one of these actions (see https://github.com/advisories?query=type%3Areviewed+ecosystem%3Aactions), we would be able to match it and report the vulnerability via the SBOM from syft. Are you looking for more support along the lines of this sentence:
I get out an SBOM that contains this action, and which actions in turn it contains.
For pulling the transitive dependencies and doing a fuller analysis of the action, see our discussion on the live stream this week about some of the challenges around that: https://www.youtube.com/c/Anchore
Ah, that's super interesting then! Thanks @spiffcs. I also appreciate the discussion you had on the stream.
What I'd be most interested in is solving a situation like this one:
Let's say that something like the compromise of tj-actions happens again. What I would want to know is whether or not my builds are affected.
In such a case, I'd need to know whether there's a repo in my org that runs tj-actions directly, or whether someone in my org is using an action that imports it transitively. With the current setup, I'd only be able to detect the direct-import case.
Concretely, I would like this to be solved with an SBOM that functions as a call graph of actions. I've supplied a highly simplified example below.
actions-bom:
  actions:
    - name: actions/checkout
      version: 4.3.2
      sha: <commit-hash-for-v4.3.2>
      references:
        - name: actions/setup-go
          version: 3.3.3
          sha: <commit-hash-for-v3.3.3>
          references:
            - name: tj-actions/changed-files
              version: 1.2.3
              sha: <commit-hash-for-v1.2.3>
This would give me visibility into whether we use the action, and whether we're doing so on a compromised version.
It doesn't go quite as deep as what you all alluded to in the stream, but I think this is deep enough to still get some value. It would of course be nice to see the node or container dependencies as well, but as you touched on, that does become more complicated to do.
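To make the lookup concrete, here is a minimal sketch of how such a nested structure could be searched for a given action. It assumes the simplified `actions-bom` shape from the example above has already been loaded into a Python dict; the `find_paths` helper and the placeholder `sha` values are hypothetical and not part of Syft.

```python
# Hypothetical in-memory form of the simplified actions-bom example.
bom = {
    "actions-bom": {
        "actions": [
            {
                "name": "actions/checkout",
                "version": "4.3.2",
                "sha": "<commit-hash-for-v4.3.2>",
                "references": [
                    {
                        "name": "actions/setup-go",
                        "version": "3.3.3",
                        "sha": "<commit-hash-for-v3.3.3>",
                        "references": [
                            {
                                "name": "tj-actions/changed-files",
                                "version": "1.2.3",
                                "sha": "<commit-hash-for-v1.2.3>",
                            }
                        ],
                    }
                ],
            }
        ]
    }
}


def find_paths(nodes, target, trail=()):
    """Yield every chain of name@version strings that ends at `target`."""
    for node in nodes:
        path = trail + (f"{node['name']}@{node['version']}",)
        if node["name"] == target:
            yield path
        # Descend into transitive references, if any.
        yield from find_paths(node.get("references", []), target, path)


paths = list(find_paths(bom["actions-bom"]["actions"], "tj-actions/changed-files"))
for path in paths:
    print(" -> ".join(path))
```

Each printed chain shows how the action is reached, which covers both the direct and the transitive case.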
Addendum:
A thing to note here is that the version tag and the sha of the commit are both stored. Equipped with this kind of format, I could create an action that I set to run before all other actions. On the first run, it would simply create and persist the SBOM. On subsequent runs, it would diff the new SBOM against the previously generated one. I could then check whether a version tag stayed the same while the sha changed. This would let me mitigate the type of attack seen in the tj-actions compromise, as I would catch the diff and fail the build.
This probably lies outside of the intended use of Syft, so I wouldn't expect you to implement that part, but if I have the SBOM generated by Syft, then this is a capability that Syft can enable me to build.
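The diff step from the addendum could look roughly like the sketch below. It assumes both SBOMs use the simplified nested shape from the earlier example (lists of nodes with `name`, `version`, `sha`, and optional `references`); `flatten` and `moved_tags` are hypothetical helper names, and the sha strings are made-up placeholders.

```python
def flatten(nodes):
    """Yield (name, version, sha) for every node in the nested graph."""
    for node in nodes:
        yield node["name"], node["version"], node["sha"]
        yield from flatten(node.get("references", []))


def moved_tags(previous, current):
    """Return (name, version, old_sha, new_sha) where the tag kept its name but the sha changed."""
    old = {(name, version): sha for name, version, sha in flatten(previous)}
    moved = []
    for name, version, sha in flatten(current):
        old_sha = old.get((name, version))
        if old_sha is not None and old_sha != sha:
            moved.append((name, version, old_sha, sha))
    return moved


# Placeholder data: same tag, different commit on the second run.
previous = [{"name": "tj-actions/changed-files", "version": "1.2.3", "sha": "aaa111"}]
current = [{"name": "tj-actions/changed-files", "version": "1.2.3", "sha": "bbb222"}]

changes = moved_tags(previous, current)
if changes:
    for name, version, old_sha, new_sha in changes:
        print(f"{name}@{version} moved: {old_sha} -> {new_sha}")
```

A pre-build action could exit non-zero whenever `moved_tags` returns a non-empty list, which would fail the build as described.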
Actions have the added complexity that we need to resolve many of the "version" values to understand which version the action actually is. As you noted, many users simply use @v3, @latest, @main, or a full SHA, which may or may not correspond to a version tag. Luckily, we do parse versions in comments, to support what Dependabot outputs, like this:
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
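For illustration only, the sketch below shows one way a `uses:` line with a trailing version comment could be split into name, sha, and version. It is not Syft's actual parser (Syft is written in Go); the regular expressions and the `parse_uses` helper are assumptions made for this example.

```python
import re

# name@ref, optionally followed by "# <version>" as Dependabot writes it.
USES_RE = re.compile(
    r"uses:\s*(?P<name>[^@\s#]+)@(?P<ref>[^\s#]+)(?:\s*#\s*(?P<comment>\S+))?"
)
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def parse_uses(line):
    """Return a dict with name, sha, and version, or None if the line has no name@ref."""
    match = USES_RE.search(line)
    if match is None:
        return None  # e.g. local actions such as ./.github/actions/bootstrap
    ref = match["ref"]
    is_sha = FULL_SHA_RE.fullmatch(ref) is not None
    return {
        "name": match["name"],
        # A full 40-character hex ref is treated as a commit sha.
        "sha": ref if is_sha else None,
        # When pinned to a sha, the version can only come from the comment.
        "version": match["comment"] if is_sha else ref,
    }


pinned = parse_uses("uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4")
tagged = parse_uses("uses: actions/checkout@v4.2.2")
print(pinned)
print(tagged)
```

Refs such as @v3 or @main come back with `sha` set to `None`, which marks them as candidates for online resolution.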
We already support online resolution of various things including Maven POM files and Golang licenses, which can include downloading git repositories. We could probably use the GitHub API to figure out some of this, but there's the very real possibility that users will get rate limited very quickly, so we will need to take that into consideration.
I think it would make sense to add an enhancement step, at least to get a transitive graph for composite actions and to resolve version information when we don't have a full semver directly in the YAML we scanned. If someone works on this, we should add appropriate options so this works with the --enrich flag, e.g. --enrich all and --enrich github-actions or whatever term(s) make sense.
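A rough sketch of what building the transitive graph could involve is shown below. It assumes a caller-supplied `fetch_uses` function (hypothetical) that returns the `owner/repo@ref` strings found in a composite action's definition; how that function obtains them (GitHub API, git clone, cache) is deliberately left open. The action names in the fake lookup table are made up.

```python
def resolve(uses, fetch_uses, seen=None):
    """Build a nested node for an "owner/repo@ref" string, following its references."""
    if seen is None:
        seen = set()
    name, _, ref = uses.partition("@")
    node = {"name": name, "version": ref, "references": []}
    if uses in seen:
        # Already expanded elsewhere; stop here to avoid cycles and repeated lookups.
        return node
    seen.add(uses)
    for child in fetch_uses(uses):
        node["references"].append(resolve(child, fetch_uses, seen))
    return node


# Fake lookup table standing in for online resolution (placeholder names).
fake_index = {
    "org/build@v1": ["org/setup@v2"],
    "org/setup@v2": ["org/lint@v3", "org/build@v1"],  # note the cycle back to build
    "org/lint@v3": [],
}

graph = resolve("org/build@v1", lambda uses: fake_index.get(uses, []))
print(graph)
```

Tracking `seen` also bounds the number of lookups to one per distinct `owner/repo@ref`, which matters if each lookup is a rate-limited API call.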
Yeah, the rate limit can be a problem. Maybe the user should be able to supply a GitHub token. Authenticated requests have a significantly higher rate limit.
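As a sketch of the token idea, the snippet below builds a request against the GitHub REST API "get a commit" endpoint, which accepts a tag, branch, or sha as the ref, and attaches the token only when one is provided. Reading the token from a `GITHUB_TOKEN` environment variable is an assumption for this example, as is the `commit_request` helper name; nothing here reflects an existing Syft option.

```python
import os
import urllib.request


def commit_request(owner, repo, ref, token=None):
    """Build a request that resolves a tag, branch, or sha to a commit."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits/{ref}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Authenticated requests are subject to a higher rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# Assumption: the token, if any, is supplied via the environment.
request = commit_request("actions", "checkout", "v4.2.2", token=os.environ.get("GITHUB_TOKEN"))
print(request.full_url)

# Sending the request is left out so the sketch runs offline, e.g.:
#   import json
#   with urllib.request.urlopen(request) as response:
#       sha = json.load(response)["sha"]
```

Falling back to unauthenticated requests when no token is set keeps the default behaviour unchanged, at the cost of the lower limit.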