arcade icon indicating copy to clipboard operation
arcade copied to clipboard

GitHub sometimes doesn't run CI when commits are pushed to a PR

Open RikkiGibson opened this issue 2 years ago • 4 comments

We found that dotnet/roslyn#62364 didn't kick-off a CI run on the last commit that was pushed to the PR. Instead it just kept the check results for the second-to-last commit. The last commit broke the build and we didn't find out until the PR was merged.

We'd like to track this on an ongoing basis and record any further instances of this problem in this issue.

RikkiGibson avatar Jul 21 '22 21:07 RikkiGibson

@RikkiGibson @davidwengier Did it really not run at all? I thought it was run, but it pulled the second-to-last commit?

Youssef1313 avatar Jul 22 '22 13:07 Youssef1313

This is the last commit run (the bad run): https://dev.azure.com/dnceng/public/_build/results?buildId=1892504&view=logs&j=5e4c1a10-89a7-5bf1-0765-c10b6cf6bdb8

And this is the second-to-last commit run: https://dev.azure.com/dnceng/public/_build/results?buildId=1884883&view=logs&j=b2bb0dc1-9cbd-51f0-af1d-56de48e14afe

The checked out commits are https://github.com/dotnet/roslyn/commit/1d8827d0327d0c0bb13e99ec84bbc89cc3d1f4ca and https://github.com/dotnet/roslyn/commit/619c9e6d35a541518f91fc462b2a465323d99c5e

Both are merges of the second-to-last-commit https://github.com/dotnet/roslyn/commit/f41541cc8aec963822e31e02f3f5c06ec8dd11c2

Youssef1313 avatar Jul 22 '22 13:07 Youssef1313

@RikkiGibson, is there any action for us to take on this issue? It has been open for a while and no data has been added for ~20 days.

ilyas1974 avatar Aug 12 '22 19:08 ilyas1974

I don't think there is.

RikkiGibson avatar Aug 12 '22 20:08 RikkiGibson

I think this has just happened again in https://github.com/dotnet/roslyn/pull/63398

  1. a119633 and 5d6915c were pushed to the PR (both at once, but 5d6915c is the last commit).
  2. CI ran on 97fd987, but ended up associating the checks with 5d6915c (going to the checks for 5d6915c brings you to this build https://dev.azure.com/dnceng/public/_build/results?buildId=1943981&view=results. This build is associated with this merge commit https://github.com/dotnet/roslyn/commit/ba8c8b5616051ac1f235b1b67c64dd562ce66f17 which does not have the right change to SolutionServices.cs SolutionServices.cs from 5d6915c
  3. CI failed because 97fd987 stack overflows. However it should have ran on 5d6915c which does not stack overflow.

dibarbet avatar Aug 15 '22 21:08 dibarbet

@dibarbet

I think it ran on 97fd9879915a25909a68a5ee7ea0b2db014e5f78 (the commit before a119633)

Youssef1313 avatar Aug 15 '22 22:08 Youssef1313

@dibarbet

I think it ran on 97fd9879915a25909a68a5ee7ea0b2db014e5f78 (the commit before a119633)

you're totally right, updated my comment.

dibarbet avatar Aug 15 '22 22:08 dibarbet

Again on https://github.com/dotnet/roslyn/pull/63408

f5356c9 showed checks from 4e6b2df (failed with an error fixed in f5356c9)

dibarbet avatar Aug 29 '22 21:08 dibarbet

@ilyas1974 Can someone look at this? It seems pretty bad, especially as it means that CI could pass/fail on something different than the actual code change requested.

CyrusNajmabadi avatar Aug 29 '22 21:08 CyrusNajmabadi

This feels like a potential security bug.

CyrusNajmabadi avatar Aug 29 '22 21:08 CyrusNajmabadi

@adiaaida is working on opening an ICM with Azure DevOps to see if they can help with this issue.

ilyas1974 avatar Aug 29 '22 22:08 ilyas1974

I believe this to be an issue between github and azdo (github not accurately informing azdo that a run needs to be started on the latest commit). I have opened https://portal.microsofticm.com/imp/v3/incidents/details/331322707 with AzDO to hopefully get some traction on the issue.

michellemcdaniel avatar Aug 29 '22 22:08 michellemcdaniel

Hi there everyone, the AzDO team looked into this, and this is what they found:

A bit of context on what ADO is doing: when ADO receive pull_request event, we are trying to retrieve PR merge SHA from GH. In case if it's not ready yet a new job is scheduled to repeat the same request in a couple of seconds. For the case you have there been two close pushes to the same PR and when first pull_reqeust update event landed to ADO servers we did request to get merge SHA from the GH -> merge was not ready so we reschedule job to repeat it after some time. The job runs and requested PR info, request return merge commit SHA and we used it for the build, the same happened for second pull_request update event. So we have two builds with the same merge SHA. For both events, get PR information return the same merge SHA, I would expect GH to return different merge SHA's or merge SHA for the second push for both requests.

From ADO perspective everything was fine, two builds were triggered with the same source version because get pull_request info request return same merge SHA in both cases.

It's possible this is a github issue, rather than an AzDO issue, since builds are be triggered for each commit.

michellemcdaniel avatar Sep 27 '22 16:09 michellemcdaniel

As there is no additional action for dnceng to take at this time, I am closing this issue.

ilyas1974 avatar Oct 21 '22 16:10 ilyas1974