Checks reported in the wrong PR
What happened? What is the problem?
From slack thread.
I don't know exactly what happened in PR 781. Filtering jobs by defined branch seems ok. This was the packit configuration.
Looking at our logs, I see that the wrong checks were submitted (in PR 781) by jobs associated with another PR, the 780 which was created against main soon before 781 and closed.
Comparing the wrong test ids in the link for PR 781 and the test ids for [PR 780] they match.
It seems that packit reported the status in PR 781 for jobs started in PR 780.
As Karel wrote:
I am just thinking if one cannot use it to bypass some "control" mechanisms I mean, 1st opening a MR against a branch where I can expect tests to pass and then 2nd open another MR against a branch where I want my change to land and confuse reviewers
Hello. What I did was basically this:
- Accidentally opened a PR against the
mainbranch - this branch is tested on fedora-all and CSs. - I have closed PR right away after realizing this error
- I have created new PR (using the same source branch) against the
rhel-9-mainbranch. - I could see that jobs for fedora-all and CSs are present, even though only c9s jobs should be there.
- After pushing another commit to the source branch jobs have been rescheduled and correctly limited to c9s.
I'll try to look into this.
I looked into it and my current understanding is this:
A check run is associated with a specific commit, not the PR that the check was triggered in. When Packit reports the result of a check run, it calls the Github REST API, which looks at the sha of the commit and adds the check run to all open PRs, which contain that commit. The check run is attached to that commit, so even when a new PR was opened, 8d22b08 was again the head commit of that PR, but a check run already existed for it from the previous, closed PR.
I'm wondering whether Packit re-ran the two tests for Centos 9 when the second PR was opened, or whether it only updated the check-run results based on checks triggered for the old PR. If Packit actually created a new check run in #781, then there would be two check runs named Packit-as-a-Service present (I believe), but there is only one (the one triggered in the closed PR). So I believe (though I may be wrong) that Packit didn't create a new check run once it saw that a check run was already present.
It's not possible to delete a check run, so we cannot remove the invalid check run from the second PR. If we update a check run, it's going to be updated in all PRs, which contain this check run. I'm thinking what the most elegant solution would be.
My current proposal is as such:
- Before creating a new check run for a commit, check whether a check run called
Packit-as-a-Servicealready exists. - If it does, then that means that the old check run was triggered in a different PR and thus is probably not valid for the current PR, so proceed with the following:
- Update the old check-run. Perhaps change the check run name from
Packit-as-a-Serviceto something like[OUTDATED] Packit-as-a-Service. - Then create a new
Packit-as-a-Servicecheck run, which will be valid for the current PR. - The check marked as outdated could contain a link to the PR, which it was originally triggered in (if possible to find out) and an explanation for the user for clarity.
- Update the old check-run. Perhaps change the check run name from
If Packit detects that a Packit-as-a-Service check run already exists, it could just leave a comment in the current PR informing the user of what happened and explain why there are now two Packit-as-a-Service check runs, one of which is outdated.
That's the best solution that I can think of at the moment. If you have better ideas, please let me know. I will look more into this in case I missed or misunderstood something.
Or instead, if this kind of situation was to occur and there would be multiple Packit check runs associated with the same commit across multiple PRs, I would mark each check run as [PR #0000] Packit-as-a-Service. That would probably be the best solution as it would work even if the user were to have more than two check runs associated with the same commit (each triggered in different PR). Though I am not sure how difficult it would be to find out the number of the PR where the previous check run was created...
UGH…
A check run is associated with a specific commit,
actually thought about this when I read the description of the issue 😁
A check run is associated with a specific commit, not the PR that the check was triggered in.
Technically GitHub has a notion of Check suites, but… this adds way too much complexity to the already complex reporting on the Packit' side. This would require a lot of changes and also keeping track in the database which would eventually create more related bugs. IMO it's not worth it.
If Packit actually created a new check run in #781, then there would be two check runs named Packit-as-a-Service present (I believe)
Excerpt from the docs:
[!NOTE] A GitHub App usually only receives one check_suite event per commit SHA, even if you push the commit SHA to more than one branch. To find out when a commit SHA is pushed to a branch, you can subscribe to branch create events.
Packit-as-a-Service is a name of the app, GitHub groups check runs/suites based on the “provider” (i.e. the app) which brings me to the next point:
Update the old check-run. Perhaps change the check run name from
Packit-as-a-Serviceto something like[OUTDATED] Packit-as-a-Service. I would mark each check run as[PR #0000] Packit-as-a-Service.
You can't do either of those, because it's not allowed.
The easiest “solution” to this “issue” would be prioritising user-side cancellation of jobs from #5, though there are still other priorities in that epic, such as being able to cancel Koji jobs (especially in the light of Packit as a Fedora CI).
To address the initial concern from the issue description:
I am just thinking if one cannot use it to bypass some "control" mechanisms I mean, 1st opening a MR against a branch where I can expect tests to pass and then 2nd open another MR against a branch where I want my change to land and confuse reviewers
I'm not sure how to address this properly, so I'll go piece-by-piece:
- “if one cannot use it to bypass some "control" mechanisms”, depends on whether you mean automation (on GitHub' side) or maintainers; regardless of which one: yes and no
- I'll start with automation:
- require green CI /= require certain checks; GitHub allows requiring specific checks to pass before merging, and it can be even configured per branch, i.e., you definitely can adjust the merging requirements even for complex configuration
- based on the config linked above, there are different targets (chroots) based on the target branches, so I wouldn't say that it's possible to bypass required checks, if you have configured that you require specific checks to pass (check names include, by default, target too)
- running different test suites for the same target depending on the target branch would be a problem; in general the CI runs are tied to the commit, not the PR (only Pagure allowed some sort of hack for setting their “status flag” on the PR itself), given that you certainly can configure Packit to run different test suites based on the target branch, you could somehow cause false negative… unfortunately, we cannot be responsible for that and there's also no way from our side to prevent that
NOTE: This can be easily fixed by using identifiers (that are, by default, included in the check names), and making both the branch protection rules and Packit jobs more explicit
- as for the human error, I'd say it's present at all times 🤷 I'd also expect maintainers to be sensible, though that expectation has already been felled for me…
One way or another, I'd probably adjust the merging requirements to be more explicit to cover this case; only other option is simplifying the test matrix, but I have a feeling that's a no-go for you
Packit-as-a-Serviceis a name of the app
Oh, I didn't realize that. Wouldn't the following be also an option?
- Before creating a new check run for a commit, check whether a check run created by Packit-as-a-Service app already exists.
- If it does:
- Cancel all jobs in this check-run.
- Trigger jobs according to the configuration in this PR.
- Update check run accordingly.
It's not the best solution as it updates the check run in the closed PR as well, but it's still better than just having the old check run info in the new PR...
Oh, I didn't realize that. Wouldn't the following be also an option?
Before creating a new check run for a commit, check whether a check run created by Packit-as-a-Service app already exists.
If it does:
- Cancel all jobs in this check-run.
- Trigger jobs according to the configuration in this PR.
- Update check run accordingly.
It's not the best solution as it updates the check run in the closed PR as well, but it's still better than just having the old check run info in the new PR...
You can open multiple PRs with the same commit, e.g., fast-forwarding in dist-git; regardless whether the other PRs are open, or closed, cancelling the other jobs doesn't remove the checks, they're there, even if you mark them as cancelled. And in case of multiple open PRs with the same HEAD commit, but different builds/tests, you could basically block some of them because of this cancelling.
Now that I think about it, I'd rather not mention how many times I got annoyed, because of the merge commit I couldn't look up the last tests / RPM builds… The fact the checks are tied to the commit probably stems from this too, i.e., you can check the CI results after the merge.
One way or another, we're trying to workaround git-forge design, whatever is the actionable outcome from this issue, it will cause pain further down the line…
Alright, shall I move this back to backlog then, since no good solution to this was found?