terraform-provider-okta
okta_app_saml and okta_app_oauth cause rate limit/timeout errors during plan phase
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform Version
TF version 1.1.3
Affected Resource(s)
- okta_app_saml
- okta_app_oauth
Terraform Configuration Files
resource "okta_app_oauth" "mobile" {
  label          = "Mobile API Client"
  type           = "native"
  response_types = ["code"]
  grant_types = [
    "refresh_token",
    "authorization_code",
  ]
  redirect_uris              = local.api_client_callback_urls
  token_endpoint_auth_method = "none"
  refresh_token_rotation     = "STATIC"

  # See https://registry.terraform.io/providers/okta/okta/latest/docs/resources/app_group_assignment
  # When using this resource in conjunction with other application resources (e.g. okta_app_oauth) it is advisable to add the following lifecycle argument to the associated app_* resources to prevent the groups being unassigned on subsequent runs:
  lifecycle {
    ignore_changes = [groups]
  }
}
resource "okta_app_saml" "saml_app" {
  label                    = "SAMLApplication"
  sso_url                  = local.env.xxx
  recipient                = local.env.xxx
  destination              = local.env.xxx
  audience                 = local.env.xxx
  assertion_signed         = true
  response_signed          = true
  signature_algorithm      = "xxx"
  digest_algorithm         = "xxx"
  honor_force_authn        = true
  idp_issuer               = local.env.saml_issuer
  authn_context_class_ref  = "urn:oasis:names:tc:SAML:2.0:xxxx"
  subject_name_id_template = "xxx"
  subject_name_id_format   = "xxx"

  # See https://registry.terraform.io/providers/okta/okta/latest/docs/resources/app_group_assignment
  # When using this resource in conjunction with other application resources (e.g. okta_app_oauth) it is advisable to add the following lifecycle argument to the associated app_* resources to prevent the groups being unassigned on subsequent runs:
  lifecycle {
    ignore_changes = [groups]
  }
}
Debug Output
https://gist.github.com/kostacasa/af28c7f01ece535ffc66f5bcd86a419c
Expected Behavior
Apps should be updated without rate limits being hit (that ultimately cause timeouts).
Actual Behavior
Our org hit rate limits during the plan phase.
Steps to Reproduce
Running tf plan is enough.
A regression in the Okta Terraform Provider seems to have been introduced between versions 3.12.1 and 3.13.1. The former runs a plan on our org successfully; the latter (and every version up to the latest) times out during the plan phase.
We also tried the max_api_capacity parameter, which prevented the rate limit from being hit, but the plan phase still timed out after 15 minutes.
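For readers unfamiliar with the setting, here is a minimal sketch of where max_api_capacity goes in the provider block. The value 50, the variable name okta_api_token, and the org placeholder are assumptions for illustration; the report does not say which value was actually used.

```hcl
# Hypothetical variable holding the API token; the name is an assumption.
variable "okta_api_token" {
  type      = string
  sensitive = true
}

provider "okta" {
  org_name  = "xxx" # placeholder, matching the redacted org in this report
  base_url  = "okta.com"
  api_token = var.okta_api_token

  # Percentage (1-100) of the API rate limit the provider may consume.
  # 50 is an illustrative value only, not the one used in this report.
  max_api_capacity = 50
}
```

As described above, throttling this way avoided the rate limit errors but did not reduce the number of requests, so the plan still ran out of time.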
I would draw attention to the request in the debug output gist that is timing out:
Get "https://xxx.okta.com/api/v1/apps/xxx/users?after=xxx&limit=200"
The key part is the after=xxx parameter, where xxx is an actual user ID from our org. It looks like the provider is attempting to paginate through the entire user base to perform a diff, 200 users at a time. Since our org has tens of thousands of users, it appears that this takes too long (and triggers rate limits).
We tried adding the following properties to the resource:
skip_users = true
skip_groups = true
And the following ignores to the lifecycle block:
ignore_changes = [groups, users]
But neither avoided the problem.
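A sketch of how the two attempts above might look when combined on one resource. The redirect URI is a hypothetical placeholder standing in for local.api_client_callback_urls; everything else is taken from the configuration earlier in this report.

```hcl
resource "okta_app_oauth" "mobile" {
  label          = "Mobile API Client"
  type           = "native"
  response_types = ["code"]
  grant_types    = ["refresh_token", "authorization_code"]

  # Hypothetical placeholder value for illustration
  redirect_uris = ["com.example.app:/callback"]

  token_endpoint_auth_method = "none"
  refresh_token_rotation     = "STATIC"

  # Attempt 1: ask the provider not to sync assigned users and groups
  skip_users  = true
  skip_groups = true

  # Attempt 2: ignore drift on both attributes
  lifecycle {
    ignore_changes = [groups, users]
  }
}
```

On the affected provider versions, neither setting stopped the paginated requests to the app's users endpoint during plan.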
Important Factoids
These apps have close to 100k users assigned to them which I suspect matters. Our sandboxes with smaller user numbers do not experience this issue.
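To give a sense of scale, here is a rough estimate of how many page requests one full walk of an app's user list would take, written as Terraform locals so it can be checked with terraform console. Both inputs are approximations taken from this report, and the assumption that every assigned user is fetched is a guess based on the debug output.

```hcl
# Rough estimate only; both inputs are approximations from this report.
locals {
  assigned_users = 100000 # "close to 100k" users assigned per app
  page_size      = 200    # limit=200 in the timed-out request URL

  # Page requests needed to walk one app's user list once
  pages_per_app = ceil(local.assigned_users / local.page_size)
}

output "pages_per_app" {
  value = local.pages_per_app # 500
}
```

For the two affected apps that is on the order of 1,000 requests per plan, before any other API calls are made.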
@kostacasa thanks for all the details. I will see if we can address this in the next release.
+1. A lot of requests are made to this endpoint: current request "GET /api/v1/apps/<app_id>/users".
My use case only involves a data source for okta_app_oauth or okta_app_saml to retrieve the app_id, without any interaction with its user base.
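A minimal sketch of the kind of lookup described in this comment, assuming the app is found by its label. The label is borrowed from the configuration in the original report and the output name is hypothetical.

```hcl
# Look up an existing OAuth app by label; only its ID is needed.
data "okta_app_oauth" "mobile" {
  label = "Mobile API Client"
}

# Hypothetical output name, for illustration
output "mobile_app_id" {
  value = data.okta_app_oauth.mobile.id
}
```

Even though only the id attribute is read here, the comment above reports that requests to the app's users endpoint were still being made.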
+1 on this as well.
When looking at the TRACE logs, I'm seeing some of these:
2022-03-31T13:46:27.391-0700 [DEBUG] provider.terraform-provider-okta_v3.22.1: 2022/03/31 01:46:27 [INFO] Throttling API requests; sleeping for 0 seconds until rate limit reset (path class "app-id": 96 remaining of 500 total); current request "GET /api/v1/apps/0oaxxxx/users/00xxxxxxx"
But looking at the rate limit dashboard, it seems Okta is bucketing all of these under v1/apps*. v1/apps/{id} was barely touched.

Thanks for the details, @tantran-falconx. This is very helpful for my investigation.
Any updates on this issue? We are still unable to upgrade to the latest versions of the provider because of this. Our prod org always times out during plan phase.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days
Please keep open
@kostacasa @antonmos are you still having an issue here? A while back I refactored the rate limiting algorithm to be driven off of real accounting in the Okta monolith's integration tests. I haven't heard from anyone having rate limiting issues since.
Not happening any more for me! Thank you for fixing!
We'll call this one done