sourcegraph Per batch change control for pushing to forks (Tracking)

Context

We have recently introduced the possibility to push branches created by Batch Changes to a fork of the repository, rather than the repository itself.

This was built as an MVP to address:

Most users do not have write permission on all repositories, so branch creation fails for them (https://github.com/sourcegraph/accounts/issues/544)
Some companies enforce that all PRs created by automation are opened on forks. (https://github.com/sourcegraph/accounts/issues/565, https://github.com/sourcegraph/accounts/issues/4778 )

Problem

The current MVP state is the end state solution for companies that mandate using forks for all repositories. It's not a fully mature solution though for users that don't have write-permission on all repositories and only want to use forks in some cases. In particular:

not all users want to use forks
a given user wants to use forks, but only for a specific batch change
a user wants to create forks, only if they're missing permissions
site admins aren't aware of the fork setting, or don't set it up, resulting in https://github.com/sourcegraph/sourcegraph/issues/40922 not being solved despite the current minimal fork support (e.g. https://github.com/sourcegraph/accounts/issues/544)

Impacted customers

https://github.com/sourcegraph/accounts/issues/6859 (insight)
https://github.com/sourcegraph/accounts/issues/544
https://github.com/sourcegraph/accounts/issues/6716
https://github.com/sourcegraph/accounts/issues/7485

Next iteration

Introduce a fork attribute, allowing users to specify the fork
The site admin settingenforceForks allows admins to force that all branches are created on forks.
- TBD: Add a UI message that surfaces this to the user? "Some changesets have been created on a fork because ..."
Allow forks to be deleted when a Batch Change is closed or deleted (Source Control hygiene)

The fork attribute behaves in the way proposed by @LawnGnome (whose legacy lives on forever):

changesetTemplate:
  fork: foo           # forks into the `foo` organisation/user, assuming the
                      # user has permission
  fork: LawnGnome     # forks into the user's namespace as well (assuming the
                      # user is LawnGnome), but explicitly by name
  fork: user          # forks into the user's namespace, same as the current
                      # switch

Tracked issues

@unassigned

Completed

[x] (🏁 62 days ago) https://github.com/sourcegraph/sourcegraph/issues/50850

@adeola-ak

Completed

[x] (🏁 35 days ago) https://github.com/sourcegraph/sourcegraph/issues/50849

@courier-new

Completed

[x] (🏁 35 days ago) https://github.com/sourcegraph/sourcegraph/issues/50849

Mar 25 '22 18:03 malomarrec

Hey, @sourcegraph/batchers (@eseliger @LawnGnome @courier-new @adeola-ak @BolajiOlajide @malomarrec @chrispine @danielmarquespt) - we have been mentioned. Let's take a look.

Mar 25 '22 18:03 github-actions[bot]

I was curious to understand what was preferrable: have a fork attribute in the spec, or have a "smarter" behaviour, where batch changes would automatically create forks if it encounters a permission error. An interesting argument at https://github.com/sourcegraph/accounts/issues/6859 was:

IMO it’d be preferable to have a fork attribute. orgs should be using one pattern or the other. If they’re using a mixed model, the fork attribute might encourage them into one camp or the other

Mar 25 '22 18:03 malomarrec

Agree with the sentiment.

Mar 28 '22 20:03 eseliger

I think it should be an explicit attribute in the spec, and I'd even go a step further and say it should be the target repo name in a form we can parse, not just a boolean. I'd go with something like these options (spitballing, obviously — only one fork would be valid, but just trying to show the range of options I think we should support):

changesetTemplate:
  fork: foo           # forks into the `foo` organisation/user, assuming the
                      # user has permission
  fork: LawnGnome     # forks into the user's namespace as well (assuming the
                      # user is LawnGnome), but explicitly by name
  fork: user          # forks into the user's namespace, same as the current
                      # switch

So, basically, fork would be a string pointing to the owner to fork into, or the magic string user, which corresponds to the current user. This allows us to support use cases where the fork goes into a team or company organisation, rather than an individual user's organisation.

Bonus side note: I'd also like us to consider (eventually) adding a parallel remote entry to provide an escape hatch for weird edge cases we can't support with forks, which would be particularly useful on code hosts that don't support forking (not an issue today, but who knows where we're going). I could see that looking like this:

changesetTemplate:
  remote: foo/bar                   # pushes to an existing remote that we
                                    # won't attempt to fork, but it is assumed
                                    # already exists, using the code host of
                                    # the target repo to construct a push URI
  remote: [email protected]:foo/bar    # pushes to an explicit remote URI

Mar 29 '22 03:03 LawnGnome

From https://github.com/sourcegraph/accounts/issues/6716

Trying to run a batch change to change a team name in a few repos, but it happens that the user doesn’t have push access to those repos where the change needs to occur, which means their PAT doesn’t allow creating a branch in the repos and hence the batch change fails since currently Sourcegraph tries to make a branch in the same repo where it creates a PR. An alternative workflow could be to fork all the repos into the user’s account if not already, update the fork, create a new branch, make the change, push it and then finally create the PR to the main repo (the fork’s parent).

We let them know about the instance-wide setting, waiting to here back here

Sep 22 '22 10:09 malomarrec

We never heard back from the customer above. I'm digging into this again, followed up here.

Jan 27 '23 12:01 malomarrec

We already have what’s needed for forking, and Kelli’s current work will also make it easier to track where a fork actually lives, so all that remains would be adding it to the batch spec and also using that value in reconciler. It is not clear to me yet what customers would expect when they remove or change the value, though - this is where it would become tricky. Do we say it doesn’t affect open ones?

Some more questions to solve, without that question answered it sounds fairly straightforward, maybe a week. If we decide to respect changes in the spec.. likely much more.

Jan 27 '23 13:01 eseliger

We should finish this sooner rather than later, so moving to Triage. (Doesn't need to happen this current sprint.)

Jan 27 '23 16:01 chrispine

I'd recommend that updates to the spec warn, but don't update to cut scope.

Reasoning:

sounds wasteful to remove and reopen in a fork. A PR is short lived anyways
and it cut scopes.

That's breaking our model a bit though so that's why we need to document loudly + warn. WDYT @eseliger ?

Jan 27 '23 17:01 malomarrec

this isn't the first time we don't support a set of state transitions, the same applies for open -> draft and open: true -> open: false, for example. I'm not extremely concerned about that not being part of the initial scope.

Jan 27 '23 18:01 eseliger

I don't think we should expect to prioritize this until at least after Starship launch. I'd suggest moving it back to backlog for now until we have a better sense of what comes after Starship.

Feb 23 '23 06:02 courier-new

Additional feedback from our company.

Most users do not have write permission on all repositories, so branch creation fails for them

Our company adopted the fork option. We maintain a single large mono-organization, but allow teams to setup their own organizations if they can be more effective (such as org-wide Apps, Secrets, etc.). When we initially setup Batch Changes, it would attempt to create branches on repositories, but that would fail a large number of times for repositories in organizations the Batch Change creator did not have write access to (Though they would have read access). So, to ensure that any Batch Change submitted by an engineer would successfully lead to the creation of pull requests, we set the fork option.

not all users want to use forks a user wants to create forks, only if they're missing permissions

So this is where we are today. Batch Changes submitted by engineers are consistently getting created successfully. However, we have now introduced a pain point for teams receiving Batch Changes. As I mentioned earlier, a large number of teams are in our company's mono-organization, and that mono-organization allows limited write access. For repositories in the mono-organization, the general feedback is that receiving teams would prefer if the Batch Changes were submitted as a branch. They already work on branches as a general practice, they might not have something like fetch = +refs/pull/*/head:refs/remotes/origin/pr/* setup to pull pull requests, and they may need to modify the Batch Change within the repository to complete it.

So... I think we would like to adopt a workflow where Batch Changes attempts to submit branches on repositories, and only create a fork as a fallback. If a team still wants to avoid a fork from Batch Change submitters they can, add the user to their repository with write permissions, opt of out Batch Changes, or move their repository into the mono-organization.

May 25 '23 16:05 hutson

sourcegraph sourcegraph copied to clipboard

Per batch change control for pushing to forks (Tracking)

Context

Problem

Impacted customers

Next iteration

Tracked issues

sourcegraph
sourcegraph copied to clipboard