arcade-services icon indicating copy to clipboard operation
arcade-services copied to clipboard

Do not open codeflow PRs without meaningful changes

Open premun opened this issue 1 year ago • 6 comments

Context

We should prevent endless loops of updates between the VMR and the repo. We should still update package dependencies though.

Goal

  • Figure out the definition of a meaningful update.
  • Do not open new PRs without meaningful updates.
  • For already opened PRs, we can either flow always, even if just to bump a VMR sha in the product repo or we can pause updates so that the PR is mergeable.

premun avatar Nov 21 '24 15:11 premun

@mmitche you mentioned that we should implement not flowing builds without changes.

I think these are the 2 main scenarios where flowing might not be necessary:

Forward flows

Repo only has V.D.xml changes inside. I guess these are not necessary to be flown? Should we just cloak V.D.xml and Versions.props from the VMR?

Backflows

Flows with no source code changes? This is problematic a little since every commit of the VMR has packages associated with it, namely Arcade etc. So this means that every build will most likely have a package update for the repo. If we detect there's no source code changes to flow but the repo might be waiting for new packages, how would we go about this? Would the backflow no-op and if there is indeed a desire to update something, people would do some manual update-dependencies?

premun avatar Dec 16 '24 17:12 premun

Backflows

I think backflows always happen. We just want to avoid breaking the cycle at one point and I think forward flows are the place to do this. Breaking on back flows means that a repo isn't validating against the latest set of assets produced by the VMR.

Forward flows

Repo only has V.D.xml changes inside. I guess these are not necessary to be flown? Should we just cloak V.D.xml and Versions.props from the VMR

I think the cleanest way probably is to cloak. But there are challenges and so it will be difficult. Realistically, you need to cloak:

  • global.json
  • Versions.props
  • Version.Details.xml
  • eng/common/*

Versions.props and V.D.xml also contain changes that come from outside the VMR. For instance, let's say it's runtime-assets flowing into runtime, or an update to Microsoft.Identity.Client in Versions.props. Those can't be wholesale cloaked. Same goes for global.json, which will contain a combination of changes coming from a backflow as well as changes coming from real repo changes. The fact that global.json is involved means you can't split the files into VMR and non-VMR updates. To achieve cloaking you'd need to use a single global.json for the entire VMR, then split V.D.xml and Versions.props into 4 files. Version.External.Details.xml and Versions.External.props and Versions.props, and Version.Details.xml.

Alternatively, we could try and identify the set of diffs which were caused by backflow and exclude them. One way may be as follows: Let's say you parse the diffs as if they were version files, somehow, and understand that is dependency X.Y.Z being updated to 1.2.3, and the arcade sdk updated to 4.5.6. If you can then determine that those updates came from VMR builds, then maybe you can skip the update.

I think the main challenge is going to be determining what is in fact a 'useless update' since that is a function of both the update itself, as well as the current state of the VMR.

Either way, I want to avoid requiring manual local actions if a forward or backflow needs to be forced. A "force" switch should be available when triggering subscriptions.

mmitche avatar Dec 16 '24 17:12 mmitche

The frequency could be cut down in not so lively repos on the subscription and even the most busy ones with EveryBuild won't get as many updates per day. We will build what 2-3x per day? And maybe we would do the update pauses when a PR is green so it's mergeable (or have the "please don't update" labels)? So maybe having the daily/weekly frequency will be advised and solve this problem mostly and we are prematurely optimizing here?

Alternatively, we could try and identify the set of diffs which were caused by backflow and exclude them.

This sounds super difficult and I have a hunch we would soon discover a case when this is not possible. It will clash with other edits to the files (comments, re-ordering..). I was thinking of having a separate approach for version files but I don't have that implemented yet even. Overall the fact we are not synchronizing 2 git repos but 2 git repos + throwing new edits in this coming from the outside is super complicated and we already have cases where we will need to deal with this separately - flowing 2 updates into the same backflow PR on top of each other is problematic because the second patch will find the files altered with the package numbers from the VMR and not with the content from the VMR. Adding more into this mix sounds too complicated.

Splitting the files sounds more manageable. Additionally, there were already asks to separate Versions.props files (not by VMR-externality but by "not/updated by darc").

premun avatar Dec 17 '24 08:12 premun

Frequency can help. For this problem, I'm not generally thinking about main. That can have slower backflow rates and be fine. I'm mainly thinking about release branches. They have to flow as fast as possible, but we want to reach a steady state where it does not require manual interaction to stop the build. Maybe this is less of a concern if every build is coherent? But maybe not.

Splitting files is cleaner, but I'm skeptical it's possible. There are some files (global.json) that we can't split.

mmitche avatar Dec 17 '24 15:12 mmitche

@mmitche I am thinking of taking this one out of the main deliverable for //Build and making it an epic/part of the refinement that will come after the main deliverable (so effectively Summer).

My reasoning:

  • I believe we are fine with just disabling subscriptions for a given preview when it's out so there's no harm delivering this before servicing, so we can easily prevent endless spins on top of servicing branches.
  • Adding support to darc, PCS and code flow and then splitting these files in the product repos sounds like a challenge so this might be too much randomization at the moment. I think we should also possibly see a simpler solution once we implement the whole code flow and this might not even be necessary.

WDYT?

premun avatar Jan 13 '25 12:01 premun

I do think we can manage subscription flow in previews. I think it is crucial to have it for GA, because stability when doing servicing builds has always been pretty important. I am fine moving this out to summer.

mmitche avatar Jan 13 '25 16:01 mmitche

Example of a forward flow PR that could be a good example of what would be a meaningless change: https://github.com/dotnet/dotnet/pull/837

Another one: https://github.com/dotnet/dotnet/pull/1101

premun avatar May 28 '25 07:05 premun

Does not seem to 100% work: https://github.com/dotnet/dotnet/pull/1147 or https://github.com/dotnet/dotnet/pull/1139

premun avatar Jun 13 '25 08:06 premun

I verified in the logs and we're preventing several PRs per day from opening now

premun avatar Jun 18 '25 08:06 premun

Okay, it seems to work fine. The suspected PRs actually had other changes in them (other package bumps)

premun avatar Jun 26 '25 10:06 premun