terraform-provider-github

app_auth credentials expire after an hour

Open jcogilvie opened this issue 4 years ago • 14 comments

Terraform Version

Run terraform -v to show the version. If you are not running the latest version of Terraform, please upgrade because your issue may have already been fixed.

0.13.5

Affected Resource(s)

All github resources that require authentication.

Terraform Configuration Files

provider "github" {
  owner = "MyOrg"
  app_auth {
     # correct values are set on the command line
  }
}

Debug Output

Mon, 15 Nov 2021 00:50:18 GMT
/home/runner/work/_temp/82078d0e-8462-4088-a73b-519aff5c7a56/terraform-bin refresh --parallelism 50
Mon, 15 Nov 2021 00:50:46 GMT
module.my_module.data.github_team.core: Refreshing state... [id=...34]
[...many successful objects being refreshed...]
Mon, 15 Nov 2021 01:50:48 GMT
aws_iam_user_policy.tf_user: Refreshing state... [id=my-repo:state-access]
Mon, 15 Nov 2021 01:50:48 GMT
Error: GET https://api.github.com/repos/MyOrg/my-repo: 403 API rate limit of 60 still exceeded until 2021-11-15 02:50:44 +0000 UTC, not making remote request. [rate reset in 59m57s]

Panic Output

None

Expected Behavior

Authentication should be valid for as long as it takes terraform to run.

Actual Behavior

Credentials time out after an hour. Installation Access Tokens are valid for exactly one hour:

Installation access tokens have the permissions configured by the GitHub App and expire after one hour.
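The timing in the debug output above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the token was minted at roughly the time of the first refresh line (00:50:46), which the log does not actually show, and `parse_log_time` is a hypothetical helper written for this example.

```python
from datetime import datetime, timedelta, timezone

# Documented lifetime of an installation access token.
TOKEN_LIFETIME = timedelta(hours=1)


def parse_log_time(stamp: str) -> datetime:
    """Parse a timestamp in the format used by the debug output above."""
    parsed = datetime.strptime(stamp, "%a, %d %b %Y %H:%M:%S GMT")
    return parsed.replace(tzinfo=timezone.utc)


# Timestamps copied from the debug output.
first_refresh = parse_log_time("Mon, 15 Nov 2021 00:50:46 GMT")
first_error = parse_log_time("Mon, 15 Nov 2021 01:50:48 GMT")

# ASSUMPTION: the token was issued at about the time of the first refresh.
assumed_expiry = first_refresh + TOKEN_LIFETIME
overrun = first_error - assumed_expiry

print(f"assumed expiry: {assumed_expiry:%H:%M:%S} UTC")
print(f"first error {overrun.total_seconds():.0f}s after assumed expiry")
```

Under that assumption the first error lands two seconds after the one-hour mark, which matches the "an hour (+ a few seconds usually)" pattern described below.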

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform refresh

Important Factoids

We have a very large set of repositories managed by the github provider. In attempting to switch from OAuth token authentication to GH Apps authentication, we have discovered that after approximately an hour (+ a few seconds usually) we start getting 403s for resources that are expressly allowed by our GH App permissions.

jcogilvie Nov 15 '21 02:11

👋 Hey Friends, this issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Please add the Status: Pinned label if you feel that this issue needs to remain open/active. Thank you for your contributions and help in keeping things tidy!

github-actions[bot] Nov 30 '22 16:11

Pinning this issue as I believe it's still important and relevant.

kfcampbell Dec 01 '22 00:12

@jcogilvie The link you provided doesn't have that phrase. It was probably moved here. But it looks like they're talking about access tokens which supposedly have nothing to do with it. If I understand you correctly, for an hour terraform invocations succeed, after an hour they stop succeeding. Also the debug output ends with a rate limit error message. Which supposedly means that the issue is not with expiration.

On the other hand, I do seem to experience expiration issues. Here's a brief explanation of how I gave terraform access to GitHub. That is, it's about making API requests, but in the case of terraform you probably don't need to obtain an access token. And you certainly don't need to make API requests. But AFAICS you need to activate a device at https://github.com/login/device, whatever that means. And I guess that's the thing that expires in my case. I'm not exactly sure, but I'm going to confirm it in the near future.

Which brings up the question, did you not need to activate a device to make it work in the first place?

x-yuri Jun 20 '23 08:06

My issue is specific to logging in as an app installation with the GitHub App app_auth field of the provider, given an app ID, an app installation ID, and a PEM file.

I can't speak to the internal workings of the provider, but it's more than likely that it exchanges those credentials for a token that's valid for an hour.

jcogilvie Jun 20 '23 15:06

My issue is also specific to logging in as an app installation with the GitHub app using the app_auth {} block.

The provider indeed most likely obtains an access token, but it doesn't seem like it stores it anywhere. It seems like GitHub somehow remembers for a while who made a request. And I think it has something to do with device activation at https://github.com/login/device.

Which brings up the question I asked, what exactly did you do to make the authentication work? What I did is, I believe, the steps 1 and 2 from this gist, and activated the device at https://github.com/login/device.

That is, did you activate the device before it started working? After it stops working does activating the device make it resume?

x-yuri Jun 20 '23 16:06

I don't believe I had to activate anything, but then my first-time setup was ages ago so I could be wrong.

I see the provider here saving the app auth token as its auth token in its config object, in the same way it would store a "standard" token.

jcogilvie Jun 20 '23 16:06

I meant that I don't think it stores an access token somewhere on disk. I don't think that your case is that you run terraform, it works for an hour, and after an hour it fails. Your case is, you occasionally run terraform and after an hour from some point X it stops working, right?

x-yuri Jun 20 '23 18:06

No, my use case is the former. For terraform runs longer than 1hr using app_auth credentials, it errors after an hour.

jcogilvie Jun 21 '23 01:06

We have the same issue: our terraform plan with a full refresh takes more than 1 hour, because we manage a full organisation with 1000+ repositories as code and have more than 10'000 objects in state. So provisioning a GitHub App with a PEM still seems to go through the OAuth flow. It would make more sense to use a direct auth flow (if available) for the GitHub App, because terraform can already read the PEM file and there is no way to extend the 1-hour token, as this value comes from GitHub itself. Alternatively, if the provider could rotate the expired token, that may be another way to solve the issue.
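For illustration, rotating a token before it expires could look roughly like the sketch below. This is not the provider's code (the provider is written in Go), and every name here (`Token`, `RefreshingTokenSource`, `fetch_token`, `leeway`) is hypothetical. `fetch_token` stands in for whatever call exchanges the app ID, installation ID, and PEM for a fresh installation token.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # Unix timestamp in seconds


class RefreshingTokenSource:
    """Caches a token and fetches a new one shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Token],  # hypothetical: mints a new token
        leeway: float = 300.0,             # refresh this many seconds early
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_token = fetch_token
        self._leeway = leeway
        self._clock = clock
        self._cached: Optional[Token] = None

    def token(self) -> str:
        """Return a token value that is valid for at least `leeway` seconds."""
        now = self._clock()
        if self._cached is None or now >= self._cached.expires_at - self._leeway:
            # No token yet, or the cached one is about to expire: rotate it.
            self._cached = self._fetch_token()
        return self._cached.value
```

Each request would call `token()` rather than reusing a value obtained once at startup, so a run longer than an hour would pick up a new token instead of failing. The `clock` parameter exists only so the behaviour can be exercised without waiting an hour.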

mkushakov Jul 10 '23 10:07

@jcogilvie @mkushakov On a side note, do your terraform projects define all your environments (production, staging, ...)?

When you first start using Terraform, you might be tempted to define all of your infrastructure in a single Terraform file or a single set of Terraform files in one folder. The problem with this approach is that all of your Terraform state is now stored in a single file, too, and a mistake anywhere could break everything.

For example, while trying to deploy a new version of your app in staging, you might break the app in production. Or, worse yet, you might corrupt your entire state file, either because you didn’t use locking or due to a rare Terraform bug, and now all of your infrastructure in all environments is broken (here’s a colorful example of what happens when you don’t isolate Terraform state.)

https://blog.gruntwork.io/how-to-manage-terraform-state-28f5697e68fa

https://charity.wtf/2016/03/30/terraform-vpc-and-why-you-want-a-tfstate-file-per-env/

And to add to that, is a big terraform project really what you want? It's not like it's necessarily bad. But I suggest you think about it.

x-yuri Jul 11 '23 17:07

We have federated infrastructure with multiple workspaces. The only monolith is our terraform representation of git, which doesn't have a preproduction.

I am not running into this bug every day and have indeed refactored around the slowness of the GitHub API on several occasions; but it is still a bug, and when it does occur, it is especially impactful because we lose an hour of work on an already large terraform project. It's not like a quick rerun can solve it.

jcogilvie Jul 11 '23 19:07

And to add to that, is a big terraform project really what you want? It's not like it's necessarily bad. But I suggest you think about it.

Indeed, we know that refactoring our project into separate states is our technical debt. But right now it would require a lot of work to group and migrate ~10'000 objects. And we want to make sure that the code representation matches the actual situation in GitHub. Anyway, besides the huge state, it is not so uncommon to run terraform for more than 1 hour to avoid API rate limits. So I think the request is still valid.

mkushakov Jul 12 '23 07:07

We are managing a much smaller estate (500+ repos) and we are still hitting token timeouts. It would be great to see a resolution on this.

ChrisLopezTide Jun 05 '24 10:06

@nickfloyd can we get the Priority: High label back on this?

ChrisLopezTide Jun 27 '24 08:06

To get around the 60-minute expiry of the token, I was looking at the code. Can't we just use a PAT by setting the GITHUB_TOKEN environment variable, so that the oauth2.StaticTokenSource code in config.go simply uses this token, provided I do not set the other variables such as GITHUB_APP_ID, GITHUB_APP_INSTALLATION_ID, or GITHUB_APP_PEM_FILE? Could you please confirm my understanding?

sreejesh123 Jul 03 '25 20:07