create-pull-request Not able to properly manage long living branches updated from main that each have their own disparate changesets (unique customizations)

Subject of the issue

related to #1197 but not quite.

We are trying to use this PR tool to help maintain long-living branches for our gitops repo. I've debated many times whether or not to open this, but I can consistently reproduce this behavior in our internal live repo, and very recently in a repo I've made public to share in this issue. It also makes me batty because if I run the steps described below locally on my Mac, I get an entirely different result and changeset. It's as if the GitHub Action is using a different git or something.

All of our shared code lives in main. We branch from main for our feature and hotfix work, and iterate in our feature until ready to merge in to main.

We have environment branches (develop, stage, production) that are long-living, each with its own changesets that can be applied directly to the respective branch with or without a PR.

Each environment branch contains common changesets that originate from the main branch, however in each environment branch there are changesets that are unique to those branches and they may or may not exist in the other environment branches. We can maintain the unique changesets in the respective branches using --ours if using rebase or --theirs if using merge. We have no issue here, as we put those unique files in a single location in the repo, and use .gitattributes to maintain that setting for us. .gitattributes: (this is assuming we're using merge like in the github runner script pasted at the end of this issue)

kubernetes/some/unique/path/for/our/remote/clusters/* merge=ours
kubernetes/some/unique/path/for/our/mgmt/clusters/* merge=ours
kubernetes/some/unique/path/for/our/mgmt/flux-system/mesh/external-kustomizations/*.yaml merge=ours
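One thing that may be worth checking with the attributes above: `merge=ours` in `.gitattributes` names a custom merge driver, and that driver has to be defined in git config (commonly `git config merge.ours.driver true`). Config is not versioned, so a fresh clone on a runner does not carry a definition made on a laptop; as far as I know git then falls back to its normal text merge for those paths. The sketch below uses a throwaway repo, and every path and branch content in it is hypothetical. It only shows the driver keeping the develop-side file when both sides changed it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; all names and contents here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "demo"

# Define the "ours" driver. This lives in git config, not in the repo,
# so it must be repeated on every machine (including CI runners).
git config merge.ours.driver true

mkdir -p env
echo "shared default" > env/settings.yaml
echo 'env/* merge=ours' > .gitattributes
git add .
git commit -q -m "base"

# develop customizes the file ...
git checkout -q -b develop
echo "develop-only value" > env/settings.yaml
git commit -q -am "develop customization"

# ... and main changes the same file independently.
git checkout -q main
echo "new main value" > env/settings.yaml
git commit -q -am "main change"

# Merging main into develop keeps develop's version of env/*.
git checkout -q develop
git merge -q --no-edit main

result=$(cat env/settings.yaml)
echo "$result"
```

Note that the driver is only consulted when both sides changed the same file; a file changed on main alone is still taken from main.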

Here is where it gets weird. More often than not, when files are removed or renamed on main, these changes, if they show up in the PR at all, never actually take effect. The changeset shows that the file was deleted, but it is never deleted. It gets even weirder.

Sometimes, for branches that are not pushed to as often as others, weird merging issues happen: the PR shows the contents of one file being updated, but those contents are from another file in the same repo. I do not know how this happens. I can only surmise that it has to do with the two-dot vs. three-dot notation that GitHub uses.
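For readers unfamiliar with the two notations: GitHub pull requests show a three-dot diff by default, which compares the head branch against its merge base with the base branch, while a two-dot diff compares the two branch tips directly. A minimal sketch in a throwaway repo (file and branch names are made up for illustration) shows how the two can disagree once main has moved on:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; file and branch names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

# feature adds one file ...
git checkout -q -b feature
echo "feature" > f.txt
git add f.txt
git commit -q -m "feature work"

# ... while main adds a different file afterwards.
git checkout -q main
echo "main" > m.txt
git add m.txt
git commit -q -m "main moves on"

# Two-dot: tip of main vs tip of feature (m.txt shows up as a difference).
two_dot=$(git diff --name-only main..feature | tr '\n' ' ')
# Three-dot: merge base vs tip of feature (only what feature changed).
three_dot=$(git diff --name-only main...feature | tr '\n' ' ')

echo "two-dot:   $two_dot"
echo "three-dot: $three_dot"
```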

Steps to reproduce

The workflows I'm trying to use are here, here and here.

It should be noted that the linked workflows above are using git rebase but I've also experienced this with git merge too.

If main gets a few commits ahead before the "synced" branch is actually merged, things get weird with the GitHub Action, and weirder as time goes on: the commits stack up on the PRs and get messy, and the diffs get really weird. However, as odd as it sounds, when I run the same steps that are used on the GitHub runner in a script (below) locally on my MacBook, I get the right changeset and can actually clean up the weird commit history.

I have even gone as far as to compare the diff in the PR created by me manually from a local run, to a diff in a PR created by the Github Action run similar to the one linked above, and they're very different. I just do not know what is making it behave differently than I expect. Please let me know if I can give any other details here.

Below is the script that I'm using with our live repo that is using merge and experiences the same symptoms as described above... below <..> is used for brevity, I'm basically trying to match on the std out and having git do different things with the files it encounters. I would ideally not do this if I could get the git in Github Actions to behave consistently like the local version on our macbooks.

script:

git config core.ignorecase true
git config --global user.email "[email protected]"
git config --global user.name "bot-net"
# Bring both branches up to date before merging.
git checkout "$baseBranch"
git fetch
git pull origin "$baseBranch"
git checkout "$targetBranch"
git fetch
git pull origin "$targetBranch"
# Capture the merge output so it can be matched below.
MERGE_STATUS=$(git merge -Xtheirs "$baseBranch" --no-edit 2>&1)
if [[ "$MERGE_STATUS" =~ "<..>" ]]; then
  while IFS= read -r line; do
    if [[ $line == *"<..>"* ]]; then
      rm -f "$line"
    fi
  done <<< "$(<somegit command)"
fi
# Finish the merge commit if all conflicts were resolved above.
FINAL_STATUS=$(git status 2>&1)
if [[ "$FINAL_STATUS" =~ "All conflicts fixed but you are still merging" ]]; then
  git commit -am "Merge branch '$baseBranch' into $targetBranch"
fi
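As a possible alternative to matching on git's human-readable output, which can vary with git version and locale, the merge's exit status and the list of unmerged paths can be used instead. The following is only a sketch under that assumption, not the script from the live repo: the function name `merge_and_list_conflicts` and the demo repo are hypothetical. It uses `git diff --name-only --diff-filter=U` to list paths that are still in conflict.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Merge "$1" into the current branch. Prints "clean" on success, or the
# list of still-unmerged paths (and returns 1) when conflicts remain.
merge_and_list_conflicts() {
  local base_branch=$1
  if git merge -X theirs --no-edit "$base_branch" >/dev/null 2>&1; then
    echo "clean"
    return 0
  fi
  git diff --name-only --diff-filter=U
  return 1
}

# Demo in a throwaway repo: develop edits a file that main deletes,
# a conflict that -X theirs does not resolve on its own.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "demo"

echo "one" > a.txt
git add a.txt
git commit -q -m "base"

git checkout -q -b develop
echo "two" > a.txt
git commit -q -am "edit on develop"

git checkout -q main
git rm -q a.txt
git commit -q -m "delete on main"

git checkout -q develop
status=0
conflicts=$(merge_and_list_conflicts main) || status=$?
echo "exit status: $status"
echo "unmerged paths: $conflicts"
```

From there the script can decide per path whether to `git rm` or `git add` it before committing, without depending on the wording of git's messages.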

Thanks for any insight one can provide. If this is intended behavior please let me know and I can close this ticket. TIA

Ideally we would like the PR to show the correct changeset, and actually update the files in the same manner as the git binary would on our local machines. Lastly for some reason the PR wants to include every commit since the last common ancestor between main and the long living branch.

I do see that this just uses the git binary found in the environment path here and here. To that end I am using the ubuntu latest runner image... but I'm not 100% sure the difference in git version and environment is to blame here.. 🤔
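A quick way to compare the two environments is to print the same facts on the laptop and in a workflow step and diff the results. This is a minimal sketch; which settings actually matter here is an assumption. `core.ignorecase` is included because the script above sets it explicitly, and macOS filesystems are usually case-insensitive while the Linux runner's is case-sensitive.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Git build in use on this machine.
git_version=$(git --version)
echo "$git_version"

# Operating system (Darwin on a Mac, Linux on the ubuntu runner).
echo "os: $(uname -s)"

# Effective config and the file each value comes from.
git config --list --show-origin || true

# Settings that change merge behavior; empty output means "not set".
echo "core.ignorecase: $(git config --get core.ignorecase || true)"
echo "merge.ours.driver: $(git config --get merge.ours.driver || true)"
```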

Jun 26 '22 00:06 caleygoff-invitae

Hi @caleygoff-invitae

I'm unsure as to exactly what the problem is that you are experiencing. Are you saying that it raises a PR, but the diff is not what you expect? Does the PR branch have the commits that you expect?

Let me try and explain a bit how the action works, which might help you solve the problem. Essentially what you want to do for this use case is make your own commits before the action runs. Always check out the branch that is the target of the pull request, which in your case is production, develop, etc. In the actions workflow step, make commits directly to the checked-out branch, but do not push. When the action starts, git should be checked out locally with unpushed commits, or uncommitted changes. In your case it will be unpushed commits directly on top of production.

When the action starts, it creates a temporary branch of exactly what it finds locally. So that temporary branch will be production, plus the extra commits that have been made in the previous workflow step. The action will then use this temporary branch as the pull request branch.

So the easiest way to debug problems like this is to inspect what git looks like locally just before the action runs. That series of commits will be exactly what the new pull request will look like. If the diff is wrong, it means that the commits are wrong somehow.

I recommend performing these commands locally and checking if the commits are what you expect for the new pull request. You can also push to a new branch, raise a PR and check if the diff is what you expect.

          git config --global user.email "[email protected]"
          git config --global user.name "Bot Test"
          git config pull.rebase true
          git fetch origin main:main
          git rebase --merge -X ours main
          git pull
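To inspect what git looks like just before the action runs, something like the following could be added as a workflow step or run locally after the commands above. It is a sketch only: the function name `preview_pr` and the demo repo are hypothetical, and in a real checkout the argument would be the remote-tracking ref of the PR target, for example `origin/production`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Show what a pull request from HEAD into "$1" would contain.
preview_pr() {
  local base_ref=$1
  echo "merge base: $(git merge-base "$base_ref" HEAD)"
  echo "commits not yet on $base_ref:"
  git log --oneline "$base_ref"..HEAD
  echo "files changed since the merge base:"
  git diff --stat "$base_ref"...HEAD
}

# Demo in a throwaway repo: one unpushed commit on top of "production".
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/production
git config user.email "demo@example.com"
git config user.name "demo"

echo "v1" > app.yaml
git add app.yaml
git commit -q -m "production state"

git checkout -q -b local-work
echo "v2" > app.yaml
git commit -q -am "change that should appear in the PR"

preview_pr production
commit_count=$(git rev-list --count production..HEAD)
```

If the commit list here already contains old commits, the problem is in the steps that ran before the action, not in the action itself.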

Let me know if I've misunderstood the problem. I don't think the action is doing anything strange. If you give it unpushed commits locally, those commits will be in the new PR without modification.

Jun 27 '22 02:06 peter-evans

Hi @peter-evans thanks for the response!

> I'm unsure as to exactly what the problem is that you are experiencing. Are you saying that it raises a PR, but the diff is not what you expect? Does the PR branch have the commits that you expect?

This is correct. When the PR is raised by create-pull-request, the diff is not what I expect, but the commits I want to see are there in the commit history (after some time and many, many commits it also lists all previously merged commits, which is weird). More to the point, and I think this is the point you make later when you suggest running the commands locally: when I run the same script locally on my MacBook, push the result to origin, and open the PR manually, the diff is not only correct but also nothing like the incorrect diff shown in the PR generated by this action.

Some context here: I've been dealing with this for a few months now. To reset whatever "state" this action is using for each "target" branch I'm checking out, I run the script from my GitHub Action locally to bring the repo back to a correct state, and for a while afterwards the action creates the PRs I expect.

> So the easiest way to debug problems like this is to inspect what git looks like locally just before the action runs. That series of commits will be exactly what the new pull request will look like. If the diff is wrong, it means that the commits are wrong somehow.

I agree that the easiest way to see what is going on is to run the commands locally, which I do. But when run in the GitHub Action, the PR seems to discard the rebase and pull that were made on the local branch, compared with the diff from the manual run on my laptop described above.

> If the diff is wrong, it means that the commits are wrong somehow.

I guess I do not understand this fully. When I run this script locally, the commits are correct; I do not fully grasp why this PR tool thinks the commits are incorrect.

> I recommend performing these commands locally and checking if the commits are what you expect for the new pull request. You can also push to a new branch, raise a PR and check if the diff is what you expect.

I do, I have to do this every few weeks to "reset" the long running branches to the PR. Not only does it reset the "commit" history that this PR action for some reason needs to collect along the way, but I end up having to run the same script the workflow is using before the PR is made but on my machine, which again works fine on my machine and the diff is what I would expect.

Here is a harmless screenshot from a recently generated PR in our live repo, which I can't promote, where the plugin has done something weird.

Notice the file name 1.12.8/ConfigMap-istio-1-12-8.yaml. This file is supposed to be for version 1.12.8, the file name is correct, and we want to keep it. It is coming from main and being rebased (or merged) onto develop, and I would like to open a PR to review it. But for some reason the PR includes 86 older commits, more than 3 months old, which have already been merged into develop over that same time and shouldn't be showing up. In addition, the diff in this screenshot changes the version to 1-11-8, and I do not know why. In this PR, some recently edited files show up and are correct, but there are also changesets for much older files whose changes are already on the branch; the diff seems to undo changes that we would prefer to keep. It's very confusing.

The kicker is that I can create the same PR manually, after a run on my machine with the same script (fresh repo, everything), and I do not get this at all. Instead I get the correct diff with only the changes I expect...

I feel like I sound crazy trying to explain this lol, I'm sorry for the trouble. When the offices open in the morning here stateside I'll have my colleagues look with me, but I swear I'm not crazy lmao 😕. I am clearly seeing two different behaviors:

When I use the steps from the workflow on my laptop, it works fine: I get the correct diff and the commits I would expect in the PR, which a human made manually after pushing the changes from local to origin.
When I use the same script in a GitHub Actions workflow, which uses the create-pull-request action as the last step, the resulting PR is very different.

NOTE: this only seems to become a problem after time has passed and main has moved forward in a manner that seems to confuse the plugin, causing significant drift between, for example, main -> develop. To support this theory, I do not really see this issue in our ci branches, which are updated on every push to main and so are kept up to date. These workflows use the same script above too.

(Screenshot: Screen Shot 2022-06-26 at 10 00 51 PM)

Jun 27 '22 03:06 caleygoff-invitae

Hi @caleygoff-invitae

> I do, I have to do this every few weeks to "reset" the long running branches to the PR. Not only does it reset the "commit" history that this PR action for some reason needs to collect along the way, but I end up having to run the same script the workflow is using before the PR is made but on my machine, which again works fine on my machine and the diff is what I would expect.

What is confusing me about this is that the action doesn't maintain any "history," or use the existing pull request branch commits. The action is designed purposely to be idempotent, meaning that each time it runs the outcome is identical. Basically, each time it runs it's running as if it was the first time it's ever run. If it happens to find that there is an existing pull request branch with the same name, it force updates it. None of the commits in earlier runs are reused.

The "history" you are seeing is just GitHub's event history telling you that the pull request branch was force updated. If you were to close the PR and let the action create it again, it would be the same set of commits and diff.

Jun 27 '22 04:06 peter-evans

> The "history" you are seeing is just GitHub's event history telling you that the pull request branch was force updated. If you were to close the PR and let the action create it again, it would be the same set of commits and diff.

Everything you're saying makes sense to me. When I run the test locally, I do the following:

I delete the working directory.
Clone into the repo.
git checkout main
git fetch
git pull
git checkout develop
git fetch
git pull
---> I can either run git merge -Xtheirs main or git rebase -Xours main here

Then because I am still on develop at this point I can make a new branch called sync_develop_with_main and push that to origin.

> If it happens to find that there is an existing pull request branch with the same name, it force updates it. None of the commits in earlier runs are reused.

Continuing where I left off before, but in response to you here. I guess this might be where it's messing up. I'll confirm this, this morning, but the action script does make a branch called sync_develop_with_main when it completes. I do have delete: true but really all that does is just give us a delete button after the PR is merged. I can say confidently that we don't delete the associated branch each time the PR is merged completely.

> The action is designed purposely to be idempotent, meaning that each time it runs the outcome is identical. Basically, each time it runs it's running as if it was the first time it's ever run.

Based on what you've said, I wonder if the PR that is opened by this plugin either

sees the branch it's currently maintaining still on origin, uses the commit and diff history from the associated PR, and just stacks on top of it, or
sees that there are old, closed PRs with the branch still on origin, and so continues to add changes to it.

Is there a way to force delete the "promotion" branch post PR merge to keep the idempotency of this action? Or am I misunderstanding your assertion?

Thanks again for your replies and I do appreciate the work you and others have put in here. Cheers and Happy Monday.

Jun 27 '22 12:06 caleygoff-invitae

> Is there a way to force delete the "promotion" branch post PR merge to keep the idempotency of this action? Or am I misunderstanding your assertion?

Sorry for the delay in replying. The action will always perform in an idempotent way, regardless of an existing PR branch with the same name. If it exists, it just ignores the content and force updates it. So after you close the PR, it doesn't matter if the branch is not deleted. Next time the action runs it will overwrite the content of the branch if there is a new diff and a PR to create.
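The effect of a force update on a leftover branch can be illustrated with plain git. This is not the action's implementation, only a sketch of the general idea using a throwaway bare repo as a stand-in for origin; the branch name is the one mentioned earlier in the thread, and everything else is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# A bare repo plays the role of "origin"; all contents are hypothetical.
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/develop
git config user.email "demo@example.com"
git config user.name "demo"
git remote add origin "$work/origin.git"

echo "v1" > file.txt
git add file.txt
git commit -q -m "develop base"
git push -q origin develop

# Earlier run: the PR branch is pushed and never deleted afterwards.
git checkout -q -b sync_develop_with_main
echo "old run" > file.txt
git commit -q -am "commit from an earlier run"
git push -q origin sync_develop_with_main

# Later run: rebuild the branch from the current develop, then force-push.
git checkout -q develop
git checkout -q -B sync_develop_with_main
echo "new run" > file.txt
git commit -q -am "commit from the latest run"
git push -q --force origin sync_develop_with_main

# Only the latest run's commit remains on the remote branch.
git fetch -q origin
remote_log=$(git log --format=%s origin/develop..origin/sync_develop_with_main)
echo "$remote_log"
```

The commit from the earlier run is no longer reachable from the remote branch, which is why a stale branch left on origin should not by itself add old commits to a new PR.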

Aug 17 '22 03:08 peter-evans
