REQUEST: Repository maintenance on opentelemetry-dotnet & opentelemetry-dotnet-contrib

Open CodeBlanch opened this issue 1 year ago • 9 comments

Affected Repository

  • https://github.com/open-telemetry/opentelemetry-dotnet
  • https://github.com/open-telemetry/opentelemetry-dotnet-contrib

Requested changes

I have three things I need done. See Purpose section for details.

  • I have a GitHub App named "OpenTelemetry .NET Automation". I want to transfer this to the open-telemetry organization. Someone with admin will need to approve this.
  • Once transferred to the open-telemetry org, I need an admin to give @open-telemetry/dotnet-maintainers access to manage that app. This way all current and future maintainers will have access.
  • I need this app installed into opentelemetry-dotnet & opentelemetry-dotnet-contrib. An admin may need to do that, or the manage access might be enough for maintainers to do it themselves.

Purpose

I am working on automating the release process and a bunch of maintenance tasks for dotnet. There are a lot of steps: we open PRs, create releases, push tags, and invoke workflows in the contrib repo.

The challenge is when these things are done via GitHub actions they don't trigger other workflows. There are a few recommendations published by GitHub to solve this: https://github.com/peter-evans/create-pull-request/blob/main/docs/concepts-guidelines.md#workarounds-to-trigger-further-workflow-runs

The one I am attempting to implement with this app is: https://github.com/peter-evans/create-pull-request/blob/main/docs/concepts-guidelines.md#authenticating-with-github-app-generated-tokens

I don't like the Personal Access Token (PAT) approach because everything will look like it is being done by a person (probably me) and the tokens expire. An SSH key only works for pushes. A machine account won't work for kicking off workflows in contrib.
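
For context on the app-token approach, here is a minimal sketch of the first step: building the short-lived JWT that a GitHub App presents before it can request an installation token. The claim names (`iat`, `exp`, `iss`) and the RS256 algorithm follow GitHub's documented scheme. The `sign_rs256` callable is an assumption: the Python standard library has no RSA signing, so in practice a third-party library would sign with the app's private key. The function name and the `now` parameter are illustrative only.

```python
import base64
import json
import time
from typing import Callable, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_app_jwt(
    app_id: str,
    sign_rs256: Callable[[bytes], bytes],  # assumption: supplied by a crypto library
    now: Optional[int] = None,
) -> str:
    """Assemble header.claims.signature for a GitHub App JWT."""
    now = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iat": now - 60,   # backdated one minute to tolerate clock drift
        "exp": now + 540,  # stays under GitHub's 10-minute limit
        "iss": app_id,     # the app's ID identifies the issuer
    }
    signing_input = ".".join(
        _b64url(json.dumps(part).encode("utf-8")) for part in (header, claims)
    )
    signature = sign_rs256(signing_input.encode("ascii"))
    return f"{signing_input}.{_b64url(signature)}"
```

Unlike a PAT, nothing here is tied to a person's account: the only long-lived secret is the app's private key, and every token derived from it expires on its own.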

The release process is like this (noted where tokens will help)...

  • Workflow is manually invoked to kick things off. A tag/version is chosen.
  • Workflow opens a PR to update CHANGELOGs and public API files. A token is needed here so this PR triggers a CI workflow.
  • A maintainer has to merge this PR.
  • Once the release PR is merged another workflow creates a tag. A token is needed here to trigger workflows which spawn on tag push.
  • A release workflow triggers on the tag push and creates a GitHub release. A token is needed here to trigger workflows which spawn on release publish.
  • A release workflow triggers on the release publish. It performs these actions...
    • A cleanup PR is opened to update some metadata for the latest version. A token is needed here to trigger CI workflow on that PR.
    • We call a workflow in contrib to notify it of the new release. That workflow kicks off a similar process in contrib. A token is needed here to call into contrib.
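
The last step in the list, calling into contrib, can be sketched as a `workflow_dispatch` request against GitHub's REST API. This is only an illustration under assumptions: the workflow file name and the `tag` input in the usage note are hypothetical placeholders, and the token is assumed to carry write access to Actions on the target repository. The function only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

API = "https://api.github.com"


def build_dispatch_request(token, owner, repo, workflow_file, ref, inputs):
    """Build (but do not send) a workflow_dispatch call for another repo."""
    url = f"{API}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

A caller might build the request with something like `build_dispatch_request(token, "open-telemetry", "opentelemetry-dotnet-contrib", "notify-release.yml", "main", {"tag": "<tag>"})` and pass it to `urllib.request.urlopen`. The default workflow token is scoped to the repository it runs in, which is why this cross-repo call needs a separate token.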

Expected Duration

Permanent

Repository Maintainers

@open-telemetry/dotnet-maintainers

CodeBlanch avatar May 21 '24 17:05 CodeBlanch

The challenge is when these things are done via GitHub actions they don't trigger other workflows. There are a few recommendations published by GitHub to solve this: https://github.com/peter-evans/create-pull-request/blob/main/docs/concepts-guidelines.md#workarounds-to-trigger-further-workflow-runs

would this work for you? https://github.com/open-telemetry/community/blob/main/assets.md#opentelemetry-bot

(several other OpenTelemetry repos are using this approach for automations: https://github.com/search?q=org%3Aopen-telemetry+OPENTELEMETRYBOT_GITHUB_TOKEN+language%3AYAML&type=code)

trask avatar May 21 '24 18:05 trask

@trask I will try it out and report back.

CodeBlanch avatar May 21 '24 19:05 CodeBlanch

@trask

So I got everything to work using opentelemetrybot: https://github.com/open-telemetry/opentelemetry-dotnet/pull/5662

The big challenge I faced is that testing everything via forks is very heavy lifting. I had to create my own org. I had to create my own bot account. I had to try and get everything as close as possible to the real things. I had to make the actual automation fully configurable so forks can use different accounts/users to support testing scenarios. I'll have to document this whole mess so others can work on it in the future.

The app approach above is much easier to work with IMO. For a fork you can create an app in your personal account and install it into your forks. You can give that app fine-grained access to whatever it needs to do (more so than the PAT). When you need a token you ask the app for one based on the specific thing you need to do in a workflow and when the workflow is done that token is revoked. No ephemeral PAT needed, no user accounts needed. I think it is worth looking at for the org as a better solution overall.
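
The "ask for a token, then revoke it" flow described above maps onto two REST calls. Here is a rough sketch that builds both requests without sending them. The endpoints and the `repositories`/`permissions` body fields are GitHub's documented installation-token API; the function names and the example values in the usage note are assumptions for illustration.

```python
import json
import urllib.request

API = "https://api.github.com"
ACCEPT = "application/vnd.github+json"


def build_token_request(app_jwt, installation_id, repositories, permissions):
    """Request an installation token narrowed to given repos and permissions."""
    url = f"{API}/app/installations/{installation_id}/access_tokens"
    body = json.dumps(
        {"repositories": repositories, "permissions": permissions}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Accept": ACCEPT, "Authorization": f"Bearer {app_jwt}"},
    )


def build_revoke_request(installation_token):
    """Revoke the installation token that authenticates this request."""
    return urllib.request.Request(
        f"{API}/installation/token",
        method="DELETE",
        headers={"Accept": ACCEPT, "Authorization": f"Bearer {installation_token}"},
    )
```

A release job might, for example, request `{"contents": "write"}` on a single repository for the tag push and send the revoke request as its final step. The requested permissions can only narrow what the app was granted at install time, never widen it.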

That being said, I'm fine living with this opentelemetrybot.

CodeBlanch avatar May 29 '24 19:05 CodeBlanch

@CodeBlanch you might be interested in this SIG/project proposal by @austinlparker:

  • #2096

My personal opinion on that matter is that if we have an app, it should not be exclusive to a single repository, and we should rather look into generalizing solutions across repositories.

svrnm avatar Jun 03 '24 15:06 svrnm

we should rather look into generalizing solutions across repositories.

Definitely agree with this in general.

For this "use an app for granting permissions to automation" case though we may want to look closely at the implications. Let's say this app is installed in dotnet, dotnet-contrib, and java repos. The app will need permission to do everything required by automation for those repos. In dotnet we need to create branches, push commits, open PRs, create releases, and kick off actions. Let's say in java it needs to do package management. If we share an app, suddenly dotnet & dotnet-contrib can do package management. Also dotnet will be able to perform things in java's repo. I think some isolation in this case is a good thing: it keeps the scope of permissions down and limits which repos can interact with each other. Just food for thought!
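
To make the point concrete, here is a toy calculation of what a shared app ends up granting. The permission names and the per-repo needs are illustrative assumptions taken loosely from the example above, not an audit of any real installation.

```python
# Hypothetical per-repo needs, following the example in the comment above.
needs = {
    "dotnet": {"contents", "pull_requests", "actions"},
    "dotnet-contrib": {"contents", "pull_requests", "actions"},
    "java": {"packages"},
}


def shared_app_excess(needs):
    """For each repo, list permissions a shared app holds beyond that repo's needs."""
    # A single shared app must be granted the union of everything any repo requires.
    granted = set().union(*needs.values())
    return {repo: granted - required for repo, required in needs.items()}


excess = shared_app_excess(needs)
```

With these inputs, `excess["dotnet"]` contains `packages` and `excess["java"]` contains `contents`, `pull_requests`, and `actions`. With one app per SIG, every entry would be empty.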

CodeBlanch avatar Jun 03 '24 19:06 CodeBlanch

moving from area/repo-maintenance to area/project-infra to see if there's a better way to do this in the future

trask avatar Aug 27 '24 20:08 trask

@trask @CodeBlanch would the new otelbot app work for this use case?

austinlparker avatar Mar 06 '25 19:03 austinlparker

@austinlparker Sure, any app would work. But see my note above about permissions. The app in dotnet would need to be able to push commits, push tags, post comments on PRs, lock/unlock PRs, create branches, create/edit releases, etc. Fine to grant permission to the app to do that. But I think those permissions would apply wherever the app is installed. My concern would be that, as other SIGs have needs, the permissions for the app would grow and grow, sort of creating a honeypot for would-be attackers to access all repos. That being said, OpenTelemetryBot probably has the same issue 😄 Anyone who got access to its PAT or credentials would probably gain some sweeping access.

Having dedicated apps for different SIGs I think has the following benefits:

  • Potentially lets SIG maintainers self-service.
  • Removes the risk of the app having more permissions than it needs across repos.

CodeBlanch avatar Mar 06 '25 19:03 CodeBlanch

My concern would be that, as other SIGs have needs, the permissions for the app would grow and grow, sort of creating a honeypot for would-be attackers to access all repos.

yeah, I'd like to keep the otelbot app permission requirements pretty minimal for this reason, currently just

  • Read access to metadata
  • Read and write access to pull requests

this has been enough for the Java release process, but if a repo needs more permissions, then I agree with creating a separate app for each repo that needs more permissions, e.g. otelbot-dotnet
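
The rule of thumb above (stay on the shared otelbot app unless a repo needs more) can be sketched as a small check. The two otelbot permissions are the ones listed in this comment; the `none < read < write` ordering and the function name are assumptions made for the example.

```python
LEVELS = {"none": 0, "read": 1, "write": 2}

# The shared app's permissions as listed above.
OTELBOT = {"metadata": "read", "pull_requests": "write"}


def missing_permissions(required, granted=None):
    """Return the required permissions that the granted set does not cover."""
    granted = OTELBOT if granted is None else granted
    return {
        name: level
        for name, level in required.items()
        if LEVELS[level] > LEVELS[granted.get(name, "none")]
    }
```

An empty result means the shared app is sufficient. Anything else, such as `{"contents": "write"}` for a repo that pushes tags, points to a dedicated app along the lines of the otelbot-dotnet suggestion.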

trask avatar Apr 14 '25 20:04 trask