
okta_app_saml and okta_app_oauth cause rate limit/timeout errors during plan phase


Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

TF version 1.1.3

Affected Resource(s)

  • okta_app_saml
  • okta_app_oauth

Terraform Configuration Files

resource "okta_app_oauth" "mobile" {
  label          = "Mobile API Client"
  type           = "native"

  response_types = [ "code" ]
  grant_types    = [
    "refresh_token",
    "authorization_code"]

  redirect_uris  = local.api_client_callback_urls
  token_endpoint_auth_method = "none" 
  refresh_token_rotation = "STATIC"

  # See https://registry.terraform.io/providers/okta/okta/latest/docs/resources/app_group_assignment
  # When using this resource in conjunction with other application resources
  # (e.g. okta_app_oauth) it is advisable to add the following lifecycle argument
  # to the associated app_* resources to prevent the groups being unassigned on
  # subsequent runs:
  lifecycle {
     ignore_changes = [groups]
  }
}

resource "okta_app_saml" "saml_app" {
  label                    = "SAMLApplication"
  sso_url                  = local.env.xxx
  recipient                = local.env.xxx
  destination              = local.env.xxx
  audience                 = local.env.xxx
  assertion_signed         = true
  response_signed          = true
  signature_algorithm      = "xxx"
  digest_algorithm         = "xxx"
  honor_force_authn        = true
  idp_issuer               = local.env.saml_issuer
  authn_context_class_ref  = "urn:oasis:names:tc:SAML:2.0:xxxx"
  subject_name_id_template = "xxx"
  subject_name_id_format   = "xxx"

  # See https://registry.terraform.io/providers/okta/okta/latest/docs/resources/app_group_assignment
  # When using this resource in conjunction with other application resources
  # (e.g. okta_app_oauth) it is advisable to add the following lifecycle argument
  # to the associated app_* resources to prevent the groups being unassigned on
  # subsequent runs:
  lifecycle {
     ignore_changes = [groups]
  }
}

Debug Output

https://gist.github.com/kostacasa/af28c7f01ece535ffc66f5bcd86a419c

Expected Behavior

Apps should be updated without hitting rate limits (which ultimately cause timeouts).

Actual Behavior

Our org hit rate limits during the plan phase, as shown below:

[screenshot: okta_rate_limit]

Steps to Reproduce

Running tf plan is enough.

A regression in the Okta Terraform provider seems to have been introduced between versions 3.12.1 and 3.13.1. The former runs a plan on our org successfully; the latter (and every version up to the latest) causes timeouts during the plan phase.
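In the meantime we are effectively pinned to the last working version. The pin looks roughly like this (a sketch using the standard required_providers syntax, not our exact config):

terraform {
  required_providers {
    okta = {
      source  = "okta/okta"
      # 3.12.1 is the last version that completes a plan against our org
      version = "3.12.1"
    }
  }
}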

We also attempted to use the max_api_capacity parameter, which prevented the rate limit from being hit, but the plan phase still timed out after 15 minutes.
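The provider-level setting we tried looks roughly like this (a sketch with placeholder org values; 50 is just an illustrative capacity percentage, not necessarily the value we used):

provider "okta" {
  org_name  = "xxx"
  base_url  = "okta.com"
  api_token = var.okta_api_token
  # Only consume up to 50% of the org's published rate limit.
  max_api_capacity = 50
}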

I would draw attention to the URL in the debug output gist, which shows which requests are timing out: Get "https://xxx.okta.com/api/v1/apps/xxx/users?after=xxx&limit=200"

The key part (bolded in the original) is the after=xxx cursor; that xxx is an actual user ID from our org. It looks like the provider is paginating through the app's entire assigned-user list, 200 at a time, to perform a diff. Since our org has tens of thousands of users, this takes too long (and triggers rate limits).

We tried adding the following properties to the resource:

  skip_users = true
  skip_groups = true

And the following ignores to the lifecycle block:

ignore_changes = [groups, users]

But neither avoided the problem.
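For clarity, this is roughly how the resource looked with both attempted workarounds applied at once (a trimmed sketch, not our full config):

resource "okta_app_oauth" "mobile" {
  label = "Mobile API Client"
  type  = "native"
  # ...remaining arguments as in the config above...

  # Attempted workaround: skip reading user/group assignments on refresh.
  skip_users  = true
  skip_groups = true

  lifecycle {
    # Attempted workaround: ignore drift on assignments.
    ignore_changes = [groups, users]
  }
}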

Important Factoids

These apps have close to 100k users assigned to them, which I suspect matters. Our sandboxes, with smaller user counts, do not experience this issue.

References

  • #0000

kostacasa avatar Mar 10 '22 02:03 kostacasa

@kostacasa thanks for all the details. I will see if we can address this in the next release.

monde avatar Mar 10 '22 16:03 monde

+1. A lot of requests are being made to these endpoints: current request "GET /api/v1/apps/<app_id>/users"

My use case only uses the okta_app_oauth or okta_app_saml data source to retrieve the app_id, without any interaction with its user base.
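For reference, the lookup on my side is about as minimal as it gets, roughly this (the label is a placeholder):

data "okta_app_oauth" "this" {
  label = "My OAuth App"
}

output "app_id" {
  value = data.okta_app_oauth.this.id
}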

Cylock avatar Mar 24 '22 09:03 Cylock

+1 on this as well.

When looking at the TRACE logs, I'm seeing some of these:

2022-03-31T13:46:27.391-0700 [DEBUG] provider.terraform-provider-okta_v3.22.1: 2022/03/31 01:46:27 [INFO] Throttling API requests; sleeping for 0 seconds until rate limit reset (path class "app-id": 96 remaining of 500 total); current request "GET /api/v1/apps/0oaxxxx/users/00xxxxxxx"

But looking at the rate limit dashboard, it seems Okta is bucketing these all under v1/apps*; v1/apps/{id} wasn't even really touched.

[screenshot: Okta rate limit dashboard]

tantran-falconx avatar Mar 31 '22 21:03 tantran-falconx

Thanks for the details @tantran-falconx this is very helpful for my investigation.

monde avatar Mar 31 '22 23:03 monde

Any updates on this issue? We are still unable to upgrade to the latest versions of the provider because of this; our prod org always times out during the plan phase.

kostacasa avatar Jun 07 '22 21:06 kostacasa

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days

github-actions[bot] avatar Jan 22 '23 00:01 github-actions[bot]

Please keep open

antonmos avatar Jan 22 '23 21:01 antonmos

@kostacasa @antonmos are you still having an issue here? A while back I refactored the rate-limiting algorithm to be driven off of real accounting in the Okta monolith's integration tests. I haven't heard much from anyone having rate-limiting issues any longer.

monde avatar Mar 11 '23 00:03 monde

Not happening any more for me! Thank you for fixing!

antonmos avatar Mar 11 '23 15:03 antonmos

We'll call this one done

monde avatar Mar 11 '23 16:03 monde