icu ICU-22482 Hash-pin GitHub Actions, keep them updated with Dependabot

Checklist

[x] Required: Issue filed: https://unicode-org.atlassian.net/browse/ICU-22482
[x] Required: The PR title must be prefixed with a JIRA Issue number.
[x] Required: The PR description must include the link to the Jira Issue, for example by completing the URL in the first checklist item
[x] Required: Each commit message must be prefixed with a JIRA Issue number.
[x] Issue accepted (done by Technical Committee after discussion)
[ ] Tests included, if applicable
[ ] API docs and/or User Guide docs changed or added, if applicable

This PR hash-pins all GitHub Actions in workflows to protect them from broken or malicious releases.

It also sets up dependabot to sent a single monthly PR to update all Actions with new versions. The PR will update both the hashes and version comments.

Sorry for not sending this PR sooner, I hadn't noticed that the issue had been accepted in JIRA soon after I'd sent it 🤦.

Nov 16 '23 22:11 pnacht

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

Nov 22 '23 23:11 jira-pull-request-webhook[bot]

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

Nov 22 '23 23:11 jira-pull-request-webhook[bot]

@pnacht Thanks for the PR! When discussing the topic of whether to use a Github Action's semver or commit hash with @mihnita, we observed the pros & cons of each:

Semver:

pro: Using a semver tag like v3 allows you to automatically benefit from updates within the major version
con: You need to trust the Action author, and the author needs to adhere to the convention

Pin to commit hash:

pro: You never risk breakage due to upstream changes
con: You do not automatically pick up upstream improvements

The reason we got into that discussion is because we didn't find any clear, authoritative documentation on what best practices are for this topic.

Can you refer us to best practices docs, or give us rationale that makes it clear why we should prefer pinned commit hashes over semver tags?

In the meantime, we also observed that most (but not all_ of the actions that we use come from Github itself (actions/...), so those seem trustworthy, so that's why we defaulted to using the semver tag style.

Nov 23 '23 01:11 echeran

Sure thing @echeran!

Here are some docs I've found that suggest hash-pinning:

Note that the Apache Foundation suggests hash-pinning, but makes an exception for GitHub-owned Actions (i.e. actions/*), which may be major-version-pinned.*

As for your analysis, it's basically correct. The only thing that's "missing" is the inclusion of dependabot to update the hashes. These dependabot PRs will ensure you are (basically) always at the most recent commit. With a monthly update schedule, you're just two weeks "late" (on average) to get any newly-released versions.

But I posit this "delay" is actually a good thing: it gives the rest of the ecosystem two weeks (on average) to evaluate the new version for breakages or vulnerabilities before ICU is exposed to it.

If something dangerous exists but isn't detected, dependabot sends the PR, ICU merges it, and the Action is later found to be dangerous. But this delay will have at least reduced ICU's time of exposure to the vulnerability. And if something is found during the delay period, dependabot probably won't even send the PR and ICU won't be at risk at all.

And in the absolute worst case scenario, the new version is released just before dependabot runs to update ICU's Actions. In this case, Dependabot will send the PR and the start of ICU's exposure will be very similar to what it'd've been if the Action were major-version-pinned. However, if you have Dependabot security updates enabled in the repo settings (Settings > Code Security & Analysis), you'll also receive an "out-of-season" dependabot PR fixing the vulnerability as soon as it is discovered (seriously, enable Dependabot security updates regardless of what you choose to do with this PR!).

So, in the absolute worst case scenario, where ICU receives and merges the PR moments after the version is released, hash-pinning admittedly doesn't make much of a difference compared to major-version-pinning:

with major-version-pinning, you're exposed between the moment the malicious version is released and the moment a patched version is released.
with hash-pinning, you're exposed between the moment you merge the routine dependabot PR with the malicious version and the moment you merge the "out-of-season" PR with the patched version.

But in any other scenario, the delay will reduce (or even eliminate) ICU's exposure to the vulnerable Action.

* However, with the advent of grouped updates, where Dependabot sends a single PR for all Actions with new versions, I'd argue that there isn't a radical difference in overhead between hash-pinning a few things and hash-pinning everything.

Nov 23 '23 22:11 pnacht

Hi @pnacht , just getting back to this. I'll update the PR with the latest from upstream because we refactored our workflows in the meantime.

Also, since yesterday, we noticed that we all of a sudden started getting errors on the Scorecard workflow, but the direct cause is very hard to understand -> example. Are those errors perhaps related to the issue of hash pinning the GHA version?

Mar 20 '24 23:03 echeran

Notice: the branch changed across the force-push!

vendor/double-conversion/upstream/.github/workflows/ci.yml is no longer changed in the branch
vendor/double-conversion/upstream/.github/workflows/scons.yml is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Mar 20 '24 23:03 jira-pull-request-webhook[bot]

Notice: the branch changed across the force-push!

.github/workflows/icu_common.yml is no longer changed in the branch
.github/workflows/icu_docs.yml is no longer changed in the branch
.github/workflows/icu4j.yml is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Mar 21 '24 04:03 jira-pull-request-webhook[bot]

@pnacht Here's what I've found after a little digging: the issue that we've been seeing was recorded in https://github.com/ossf/scorecard-action/issues/997 (duplicated by https://github.com/ossf/scorecard-action/issues/998). The issue says that the problem was fixed in version 2.0.6, whereas we're currently only using version 2. When I tested (in my personal fork) using v 2.1.3, I still had the issue, even though v2.1.3 > v2.0.6. But using the latest version, v2.3.1, the job passes successfully in my personal fork.

So I amended the branch here to use v2.3.1 and we'll expect things to work after the PR is merged. I'll update the upstream issue with the info.

In the worst case, if we still have the same problem even after this amended config gets merged, then our last resort workaround is to temporarily just set publish_results: false in the job config as described in https://github.com/ossf/scorecard-action/issues/998#issuecomment-1296925968.

Mar 21 '24 05:03 echeran

Hey @echeran, sorry for the issues with Scorecard. One of its dependencies for publishing results (Sigstore) changed their trust root (https://blog.sigstore.dev/tuf-root-update/), making it incompatible with older versions of their package. Bumping to the latest version of Scorecard was the right move and should work from now on.

Sorry again for the hassle!

Mar 21 '24 15:03 pnacht