readthedocs.org icon indicating copy to clipboard operation
readthedocs.org copied to clipboard

Consider using the `merge` URL on PR builds

Open ericholscher opened this issue 4 years ago • 12 comments

This is from a user who did a bunch of research. I'm putting it here so we don't lose it:

Response

Even though it's common in CI integrations to use the merge ref, this feature is awfully underdocumented. The best I could find is:

"In order to determine the mergeability of a pull request, GitHub actually performs a merge between the head and the base ref, and the result of that merge is stored in the /merge ref"

https://discourse.drone.io/t/github-claims-that-merge-refs-are-undocumented-feature/1100

The whole initial post is actually worth reading, as it also mentions that this is used by Travis and CircleCI.

I know Azure Pipelines does the same (slightly documented at https://docs.microsoft.com/en-us/azure/devops/pipelines/repos/github?view=azure-devops&tabs=yaml#pr-triggers, "GitHub creates a new ref when a pull request is created. The ref points to a merge commit, which is the merged code between the source and target branches of the pull request.")

And GitHub Actions does the same as well (which is not surprising, as this code is most likely forked from Azure Pipelines after Microsoft took over). Have a look at this example:

https://github.com/opensocdebug/osd-sw/pull/44/checks?check_run_id=545686200 for example, look at the "Run actions/checkout@v2" step):

/usr/bin/git checkout --progress --force refs/remotes/pull/44/merge

Again, there's not much documentation, beyond their API docs, which indicates that they use the merge ref: https://help.github.com/en/actions/reference/events-that-trigger-workflows#pull-request-event-pull_request (and a short note indicating how to change that behavior at https://github.com/actions/checkout/blob/574281d34cf49767d4b75b691c4c4f4655e0c93f/README.md#checkout-pull-request-head-c…).

There is also some discussion about the merge ref not always being available -- I've never seen that happen in CI systems. When the webhook is delivered the merge ref is always present (in my experience).

So sorry for being able to give you only clues, but that's the best I could find (and I did look for that already a year back for another project).

ericholscher avatar Apr 23 '20 21:04 ericholscher

Cc @imphil.

eric-wieser avatar Apr 26 '20 07:04 eric-wieser

An alternative: hide it behind a feature flag

eric-wieser avatar Apr 26 '20 07:04 eric-wieser

Using /merge has one good thing:

  • you are 100% sure that after merging the branch it won't break at all

and one bad thing:

  • if there is conflicts in the PR, it won't trigger the build

An alternative: hide it behind a feature flag

Instead of a feature flag, I'd say that this could be a per-project config, so user can decide what's the best scenario for them.

humitos avatar May 08 '20 10:05 humitos

IMO, the main benefit of using /merge is consistency with other CI solutions: if you have multiple checks under a PR which report results, it is confusing if those checks are actually run against different code. Since most (all?) commonly used CI solutions (e.g. Travis, GH Actions, Azure Pipelines, etc.) chose to use /merge [1], doing the same for the RTD check would give us that consistency.

[1] Sometimes that behavior can be configured, but I haven't come across many projects which use something other than the default there.

imphil avatar May 08 '20 10:05 imphil

Since most (all?)

CircleCI is an exception, which is often complained about:

  • https://ideas.circleci.com/ideas/CCI-I-431 (has a long list of links)
  • https://ideas.circleci.com/ideas/CCI-I-926

eric-wieser avatar May 08 '20 10:05 eric-wieser

  • you are 100% sure that after merging the branch it won't break at all

With the caveat, if you haven't pushed anything else to the target branch.

eric-wieser avatar May 08 '20 10:05 eric-wieser

Right -- it seems this approach is invalidated immediately once something has been pushed to the master branch after the build has happened. It also means that two builds of the same commit could have different outcomes based on the time the build was run. I see the value in this approach, but it also seems fundamentally broken and not how I modeled CI working (eg. build this commit, not build this commit merged against some arbitrary commit on another branch based on the time I pushed it).

ericholscher avatar May 08 '20 14:05 ericholscher

Right -- it seems this approach is invalidated immediately once something has been pushed to the master branch after the build has happened.

Invalidated seems like a strong word. Its no less valid than the build of the head was, but if you keep track of what else you merged, you at least know exactly how invalid it might be. You can always retrigger the build after merging.


It also means that two builds of the same commit could have different outcomes based on the time the build was run.

That's just a labelling choice. Label builds with the SHA1 of the merge commits rather than the head commit, and you have a unique commit per build again. That's what travis does.

This is actually a feature in some ways (from the circleCI discussion):

It seems like the current way of building commits instead of merge commits for PRs means that if you have two PRs open from the same head branch but that target different base branches, they will share the same build. That is fundamentally wrong. If I need to merge a hotfix into both the develop and production branch, I need to verify that that change will not break anything on either of the two base branches. The diff for each PR can look very different, yet they share the same build. Instead, we need one build for each PR.


I see the value in this approach, but it also seems fundamentally broken and not how I modeled CI working (eg. build this commit, not build this commit merged against some arbitrary commit on another branch based on the time I pushed it).

To me, the purpose of CI is to tell you whether merging your branch will break things. The choice is between:

  1. Build the pull head, a commit which may look almost entirely different to the result of merging. Knowledge of whether the build is representative requires the reviewer to very carefully work out what has changed on the target branch since the base commit. This could be a decade worth of commits if the change is in a rarely touched file, and github doesn't even tell users the base commit!

  2. Use a snapshot of the target branch head merged with the pull head. At the time the PR is created, this represents exactly what the result of merging is. If new commits are merged to the target branch, then the reviewer can either:

    • Think about the commits they remember merging in the last days (rather than year), and decide if they might invalidate the success of a build.
    • Decide the time range is too long, and just retrigger CI. Now the CI build once again represents exactly the result of merging.

I don't see any situation in which 1 is the better behavior for a reviewer.

eric-wieser avatar May 08 '20 14:05 eric-wieser

I don't see any situation in which 1 is the better behavior for a reviewer.

This must be a polarizing issue: I engaged in the same thought process but came to the opposite conclusion.

I always configure CI systems to build the pull head (/head). If the build fails, the developer needs to be able to understand, find, and fix the cause. That means the cause of failure must be present in the code that the developer has — not in some ephemeral tentative merge ref. The developer should be able to recreate the failure in their own development environment, write the fix, push, and then see the build go green in the CI system.

Put another, simpler way: why build anything other than the code the developer wrote?

One can obtain the benefit of building /merge simply by requiring the PR to be up-to-date with the target branch before merging. When the PR is up-to-date, /head and /merge are identical, and thus the CI build tests exactly what the merge result would be. Additionally, the PR was brought up to date by conscious action: by the reviewer clicking the Update branch button, or by the developer invoking git merge target-branch, for example. That person then expects a possible new integration bug, and the /head ref is already prepared to receive whatever work is required to fix the bug.

To use a sportsball analogy:

  • /merge — a randomly moving goalpost, shrouded in fog
  • /head — periodically blowing the fog away and taking a precision bearing on the goalpost

EDIT: Changed /pull to /head.

sharpjs avatar Jul 08 '20 00:07 sharpjs

Put another, simpler way: why build anything other than the code the developer wrote?

Because the base branch will potentially be broken if the merge is not tested. And since it's a CI-like integration, it's expected that it can also be used as a PR gating check.

webknjaz avatar Jan 08 '21 13:01 webknjaz

To use a sportsball analogy:

  • /merge — a randomly moving goalpost, shrouded in fog
  • /pull — periodically blowing the fog away and taking a precision bearing on the goalpost

The analogy here is poor. /merge is briefly blowing the fog away and taking a precision bearing at the point when you shoot with the sportball. /pull is relying on your photographic memory of where the goalpost was when you were first recruited to your team. Aiming for where the goalpost used to be isn't useful if its not there any more.

eric-wieser avatar Jan 08 '21 13:01 eric-wieser

Aiming for where the goalpost used to be isn't useful if its not there any more.

TBH /merge is updated asynchronously so it sometimes has a similar problem but this is beside the point.

webknjaz avatar Jan 08 '21 13:01 webknjaz