skip-duplicate-actions Action on pull_request trigger is skipped even though base branch content changed

I have the following workflow defined, which enables concurrent skipping and then simply echoes whether a run has been skipped or not.

name: CI

on:
  push:
  pull_request:

jobs:
  find-duplicate-workflows:
    runs-on: [ self-hosted ]
    steps:
      - id: skip-check
        uses: fkirc/[email protected]
        with:
          concurrent_skipping: "same_content_newer"
          skip_after_successful_duplicate: "true"
      - run: echo ${{ steps.skip-check.outputs.should_skip }}

Now I add a new file that only exists in the main branch, and an additional file that only exists in a pr branch. Now I create a pull request to merge pr into main. The push triggers for both branches are ran as expected.

The pull_request triggers must not be skipped however, since the merge will contain both files, so the checked out contents differ from both main and pr. skip-duplicate-actions however reports should-skip = true.

See our test repo for a reproduction. Runs 6 and 11 are from push events, and 12 is from a pull_request trigger concurrent with its sibling push.

Sep 01 '22 09:09 fknorr

Thank you for the detailed report, @fknorr!

As you can see in the logs of the pull_request workflow run (#12), skip-duplicate-actions has detected that the exact same files (same tree hash) have already been checked in the push workflow run (#11) and therefore reports should-skip = true. Apparently, the push workflow run was performed a little before the pull_request workflow run. Please note that both workflow runs, pull_request and push, are performed against the same ref (the merge commit). So, this is working as intended.

However, once the pull request (https://github.com/celerity/test-skip-duplicate-actions/pull/2) is merged into the main branch, skip-duplicate-actions should report should-skip = false because of the added commit in the main branch.

Hope this helps!

Sep 12 '22 09:09 paescuj

Thanks for taking a look at this @paescuj. I have updated the test setup to report in detail which revisions are checked out and what files are present (also I have deleted all old runs so there's no unexpected interactions).

The actions script now runs

git rev-parse HEAD
ls
echo ${{should_skip}}

push trigger #19 runs for the main branch and reports

fb314dc3155556864a8d5d3fc063e0372cd0b582
file-on-main.txt
should_skip=false

push trigger #20 runs for the un-merged PR branch and reports

59164db4e10d9f17a313bd5b98e0a5c3e19c42e0
file-in-pr.txt
should_skip=false

pull_request trigger #21 runs for the implicit PR merge commit and reports

bd633cec460a9c1af3247d70fca45942903b2baa
file-in-pr.txt
file-on-main.txt
should_skip=true

The skip-duplicate-actions logs state that this run can be skipped since the same files have been checked in # 20, but that's not actually true. From my understanding push triggers always run on the unmerged branch, and pull_request triggers check GitHub's implicit merge commit.

Sep 12 '22 14:09 fknorr

From my understanding push triggers always run on the unmerged branch, and pull_request triggers check GitHub's implicit merge commit.

Of course, you're totally right! I must have been a bit tired and got things mixed up... 😄🙈 Indeed, the behavior of skip-duplicate-actions seems to be slightly wrong in such scenarios. Thank you very much for recognizing this! I'll come up with a solution in the next days and report back here.

Sep 13 '22 13:09 paescuj

Unfortunately, I don't see any way to get the tree SHA from the merge commit for all workflow runs that were triggered by the pull request event. (See #270 for an unsuccessful attempt) So as a conclusion:

[ ] When I do find the time, I'll demand GitHub to include this information in the Workflow runs API.
[ ] Document that the current checks of skip-duplicate-actions are slightly wrong on pull_request events and when there are changes in the base branch.

Sep 22 '22 07:09 paescuj

Thanks for giving this a try!

I have looked into what actions/checkout does on pull requests, since it also needs to figure out the merge SHA on a pull_request trigger. Its Readme talks about $GITHUB_SHA and $GITHUB_REF, and indeed the workflow documentation suggests that this information is passed to jobs in environment variables for any event type.

From this it seems like the Action should not rely on the API at all for figuring out what commit to run on but do a getenv instead.

Edit: Oh well, this only helps to get the commit SHA in the current job of course, not the "remote ones" (at least not without some log/communication hackery).

Sep 22 '22 09:09 fknorr

(at least not without some log/communication hackery).

One such hack could be to upload the SHA as a workflow artifact. Ugly but might just work..?

Sep 22 '22 10:09 psalz

Edit: Oh well, this only helps to get the commit SHA in the current job of course, not the "remote ones" (at least not without some log/communication hackery).

Exactly 😄

One such hack could be to upload the SHA as a workflow artifact. Ugly but might just work..?

Yeah, I was thinking about something like this. One "problem" with GitHub Actions is that you always need to keep an eye on the amount of API requests you're doing, otherwise you risk to run into the rate limit. Possibly, we could upload a map with the SHAs from the last 100 runs in each workflow run and then just fetch it from the last run, respectively. Another concern is the execution time of skip-duplicate-actions which I'd like to keep as low as possible. I might give this a try when I find some time...

Sep 22 '22 13:09 paescuj

A conservative solution could be to identify the cases we definitely know are duplicates, and keep the jobs for which we cannot be sure.

In the current run, we always know the true "checkout SHA" from the $GITHUB_SHA.
For API queries on other concurrent / finished jobs, we only consider those that were not triggered on pull_request, because we know that for non-pull_requests the "API SHA" and "checkout SHA" will match.

That means we can actively skip a pull_request trigger if we find a matching push trigger, but not the other way round. I believe the remaining triggers would also behave like pushes in that sense.

Sep 26 '22 09:09 fknorr

skip-duplicate-actions skip-duplicate-actions copied to clipboard

Action on pull_request trigger is skipped even though base branch content changed

skip-duplicate-actions
skip-duplicate-actions copied to clipboard