arcade-services icon indicating copy to clipboard operation
arcade-services copied to clipboard

Add VMR change validation

Open premun opened this issue 2 years ago • 1 comments

Context

When changes are made to the VMR, the PR build should do additional validation as some changes are not permitted.

Examples:

  • Changes to inlined submodules must not be permitted.
  • Changes to files generated by the synchronization tooling must not be permitted.
  • Changes to files cloaked by the repo should be flagged.
  • [Optional] Version.Details.xml could be validated
  • Submodules are not added in the VMR

Goal

  • [ ] Add a new command to darc (something like darc verify but maybe darc vmr verify-changes or something) which would run the above mentioned checks against the changeset (it should probably understand what is the target branch of the PR).
  • [ ] Run the validation in VMR PRs

premun avatar Oct 11 '23 12:10 premun

~I think, we will need to validate that version files were updated correctly.~ ~If people are resolving conflicts in their PR branch, they might resolve the source-mappings.json or the Version.Details.xml <Source> tag wrong.~

~We should check that the build ID in the PR corresponds with the SHAs correctly.~

This is now done via the Maestro checks.

premun avatar Jan 14 '25 08:01 premun

~We possibly should not allow changes under src/nuget-client because there is no backflow~

premun avatar Apr 24 '25 12:04 premun

https://github.com/dotnet/dotnet/pull/760 added validation to prevent changes to eng/common, Versions.props, and Versions.Details.xml if a PR was not opened by dotnet-sb-bot or dotnet-maestro. As part of this work, we should consider removing the validation added in https://github.com/dotnet/dotnet/pull/760 in favor of adding the logic to darc instead.

ellahathaway avatar May 29 '25 19:05 ellahathaway

We have decided that this will not be part of darc but a test project directly in the VMR (it can depend on DarcLib) and we will build it and run it directly there so that we have a tighter loop.

We will incorporate Ella's checks too.

premun avatar Jun 12 '25 12:06 premun

@mmitche would it be valuable to have a place in the VMR where we would designate which repos have no backflow and direct changes to them are warned about / forbidden? We'd use it for 2xx, 3xx, .. and for preview branches to alert when people make changes there.

What would be a good place for that? Some version files?

premun avatar Jun 12 '25 12:06 premun

@premun and I met offline to discuss this issue. Here are the points that came out:

  • We'll implement it in several stages. On the first stage, we'll just create a new stage in the pipeline to validate submodules in the VMR. In the second stage we can port Ella's validations into the pipeline and get rid of them in github workflows.
  • Check whether it is possible to propagate important warnings into the github PR UI from the azdo build pipeline. If not, consider failing the build when any of these checks fail, and if they are justified, repo owners can merge it despite the red build.
  • We'll add a .csproj file, probably somewhere in dotnet/eng/tools directly inside the VMR in order to perform the validation.
  • We'll need to use darclib for some of the validation parts, so we should also add darclib to the VMR's version.details.xml so that the package stays updated.
  • The pipeline should run only on PR builds, not nightly builds or any others.

adamzip avatar Jun 12 '25 15:06 adamzip

  • Third stage would be implementing the rest of the required checks specified in the issue. Second and third stage can go in whichever order (even together because the second one might be just about the removal).
  • The checks should only run for non-bot account PRs.

premun avatar Jun 12 '25 15:06 premun

@mmitche would it be valuable to have a place in the VMR where we would designate which repos have no backflow and direct changes to them are warned about / forbidden? We'd use it for 2xx, 3xx, .. and for preview branches to alert when people make changes there.

What would be a good place for that? Some version files?

My concern with checking this info is that you're forced to have data about subscriptions in two places:

  • Maestro
  • Checked into the repo.

E.g. when you branch for a preview, you're forced to go and correctly update this state in order to get correct results. This info will shift over time subtly. Repos that don't branch for a release (command line API) will have backflow from 10.0, but then at some point the backflow will switch to 11.0. Again you'll need to keep this updated.

I think the version files are the wrong place for this information. The version files and source code only represents things as they currently are, not as they could be. The subscription information represents that today.

  • I think these checks should not be based source alone. They should query Maestro state to understand what flows and what doesn't
  • That either means that Maestro needs to implement the check, or.....
  • Alternatively, you could use WIF in GH and have an Action that queries the Maestro API to determine the current flow state.

I kind of still lean towards the Maestro implementation but i understand the concern around rate limiting. Maestro would just listen for pushes to PRs across the org/dotnet repo alone, check the contents, view the subscription state, and update a check. This could be really really useful overall. You could expose information about where a given PR will flow once checked in and built, for instance.

mmitche avatar Jun 12 '25 15:06 mmitche

Maestro would have to start checking all PRs then because this will only run on non-maestro ones.

WIF could work but we currently don't have read-only permissions level in Maestro implemented. So I am a bit worried to open this as a harmful PR could write to BAR with this.

premun avatar Jun 12 '25 16:06 premun

What does the total pushes/hr look like over the org?

mmitche avatar Jun 12 '25 16:06 mmitche

I think the numbers will be low. Let's say you restrict this to just VMR PRs. I think it would be an effect of N on your rate limit, where N is the number of PRs updated/hr in the VMR. That's probably a very small number today.

mmitche avatar Jun 12 '25 16:06 mmitche

I guess it's not that many and we could ignore Maestro's PRs too. It's just the new features we need to do for this to make it reliable. Thought it's not the end of the world, I wonder what's the RoI.

premun avatar Jun 12 '25 16:06 premun

I think the ROI is not duplicating information about subscriptions into the source. IMO this is not a good thing to do.

mmitche avatar Jun 12 '25 16:06 mmitche

I more meant the ROI of the check itself

premun avatar Jun 12 '25 16:06 premun

I don't think the ROI on the new infra is so high that I would prioritize this over other QoL improvements that you have on your radar. But we will need to get this sorted out eventually. Someone will shoot themselves in the foot at some point.

mmitche avatar Jun 16 '25 17:06 mmitche

@premun can you confirm this is the behavior we want with regards to cloaked files?

  1. If we detect that the PR is creating (or even modifying) a file that matches some the the exclusion rules, either a. warn the author about it, or b. fail the validation if there's no good way to warn
  2. If changes are made to the exclusion rules inside source-mappings.json, expect that any VMR files that match the rules are also deleted - and if they are not, warn the author about it

adamzip avatar Jun 26 '25 13:06 adamzip

Yep, sounds good

premun avatar Jun 26 '25 13:06 premun