terraform-provider-digitalocean
429 Too many requests or 429 API Rate limit exceeded
Terraform Version
Terraform v0.13.5
+ provider registry.terraform.io/-/aws v3.22.0
+ provider registry.terraform.io/-/digitalocean v2.3.0
+ provider registry.terraform.io/digitalocean/digitalocean v2.3.0
+ provider registry.terraform.io/hashicorp/aws v3.22.0
Expected Behavior
Terraform to complete successfully
Actual Behavior
Terraform is getting errors back from DO API stating either Too Many Requests or API Rate limit exceeded.
Steps to Reproduce
Manage 100 domain records and it will fail while trying to refresh the state.
Important Factoids
I am managing 10 domains with about 10-20 records each. I have dropped parallelism down to 1 and I still can't get a successful run without hitting one of these two errors. This isn't an absurdly large Terraform run, yet I can't get through more than a handful of state refreshes before it errors out.
+1. Same problem for me. I need to manage 200+ droplets.
Based on the Docs
Requests through the API are rate limited per OAuth token. Current rate limits:
5,000 requests per hour
250 requests per minute (5% of the hourly total)
So one thing you can possibly do is reach out to support and see if they can increase the limit for requests per minute. The only other thing you might want to investigate is breaking the Terraform code into a module per domain so that you won't hit the limit. I realize this might not be possible or suitable for your use case, but it's something to think about.
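As a sketch of that idea (the module path and variable names here are hypothetical), each domain would get its own module instance, which also makes it easier to later split the domains into separate root configurations with their own state so each refresh only touches one domain's records:

```hcl
# Hypothetical layout: one module instance per domain. Moving each
# instance into its own root configuration (separate state) keeps the
# number of API calls per refresh small.
module "example_com" {
  source  = "./modules/domain"
  domain  = "example.com"
  records = var.example_com_records
}

module "example_org" {
  source  = "./modules/domain"
  domain  = "example.org"
  records = var.example_org_records
}
```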
+1.
Hundreds of domain records (same domain) to manage.
+1
This happens to me while managing a single large domain (~ 150 records)
module.spaces.digitalocean_cdn.assets: Refreshing state... [id=0ac3dd96-3612-4638-a802-10dadc4aecea]
╷
│ Error: Error retrieving DatabaseCluster: GET https://api.digitalocean.com/v2/databases/1cc4ed58-cd77-43c0-9a61-61062b2c12f7: 429 Too many requests
│
│   with module.databases.digitalocean_database_cluster.infra1,
│   on ../../modules/databases/main.tf line 11, in resource "digitalocean_database_cluster" "infra1":
│   11: resource "digitalocean_database_cluster" "infra1" {
│
╵
╷
│ Error: Error retrieving DatabaseCluster: GET https://api.digitalocean.com/v2/databases/33ce4a3f-1760-4e78-8110-5f4b356da009: 429 Too many requests
│
│   with module.databases.digitalocean_database_cluster.router1,
│   on ../../modules/databases/main.tf line 33, in resource "digitalocean_database_cluster" "router1":
│   33: resource "digitalocean_database_cluster" "router1" {
│
╵
╷
│ Error: Error retrieving Kubernetes cluster: GET https://api.digitalocean.com/v2/kubernetes/clusters/d52645d0-93e9-4594-aa9a-7b9dbecef8c5: 429 Too many requests
│
│   with module.kubernetes.digitalocean_kubernetes_cluster.app,
│   on ../../modules/kubernetes/kubernetes.tf line 2, in resource "digitalocean_kubernetes_cluster" "app":
│    2: resource "digitalocean_kubernetes_cluster" "app" {
│
╵
╷
│ Error: Error reading CDN: GET https://api.digitalocean.com/v2/cdn/endpoints/0ac3dd96-3612-4638-a802-10dadc4aecea: 429 Too many requests
│
│   with module.spaces.digitalocean_cdn.assets,
│   on ../../modules/spaces/main.tf line 44, in resource "digitalocean_cdn" "assets":
│   44: resource "digitalocean_cdn" "assets" {
│
╵
Releasing state lock. This may take a few moments...
I'm literally blocked here and I don't understand why. Our tf stack is not big by any means. DO has extremely low rate limits everywhere (e.g. using Loki with Spaces is impossible). We're getting random API errors from the k8s API and the Spaces API, and the CDN API returns 50x every so often. It's getting really frustrating.
Same problem here.
So one thing you can possibly do is reach out to support and see if they can increase the limit for requests per minute.
I just tried that. The answer was simply that they are doing this "to protect the platform". So they won't increase the limits.
I now asked if they could implement at least something like a burst limit which would allow to overreach the rate limit for a short amount of time. That would probably help with use cases such as Terraform.
I'm the only one having this problem in a team of three. My coworkers can plan. I have issued tokens that are used by applications, I wonder if they're using the API and I'm hitting a global limit.
Since @djmaze and me are hit by this at the same point in time, I wonder if there's something else going on.
@djmaze It works now; my limit was apparently reset a few minutes ago, but I'll be keeping an eye on it. Here's how you can see the rate:
curl -H "Authorization: Bearer $DIGITALOCEAN_ACCESS_TOKEN" -v -I "https://api.digitalocean.com/v2/images?private=true"
...
< ratelimit-limit: 5000
< ratelimit-remaining: 4998
< ratelimit-reset: 1634198529
...
API documentation: https://docs.digitalocean.com/reference/api/api-reference/#section/Introduction/Rate-Limit
I am getting this reproducibly when running our Terraform plan twice within one or two minutes. The first run works, the second one fails. It has always been like this and has not changed for me.
Posting this here in case anyone else hits the same issue. We're using a remote secrets store (doppler) and the problem was that the entire team and a few apps were using a common token, fetched by the secrets store.
Today this hit me for the first API request while refreshing terraform state, without me doing anything on DO except a Kubernetes “login”.
I saw @tback creating a possible fix for it; any chance it will be released soon? (The last release is from two days before the fix landed)
We too regularly run into the per-minute rate-limiting of API calls. As such we too would greatly appreciate a solution to this. Be it in the form of an option for increased rate limits on DO's side, the Terraform provider automatically retrying failed calls, or even just adding support for rate limiting to the provider. We'd much prefer a deployment taking a few minutes longer, over it potentially failing.
Some background on our use-case, should it help: We regularly spawn on the order of 100 VMs for a few hours at a time, in order to profile distributed applications of ours. These deployments are automated with Terraform. Each VM we provision will lead to multiple API calls, as we also set up DNS records for each, assign it to a project, and so on.
As we regularly get rate limited this then requires multiple calls to terraform apply - spaced apart a few minutes each time - for the stack to be deployed fully. This is inconvenient when doing it manually, and a nightmare in CI.
I just installed the provider manually with a provider override and it worked for me as long as it took me to migrate away from DO.
+1 Would pay extra to not have to worry about rate-limiting the API, which I use to provision resources and therefore pay DigitalOcean more money. If there were an option to pay $X/mo for a much higher or unlimited API rate limit, I would choose that option at this moment. My other option is to spend time building rate limiting into my side of the app, which is going to take way more of my time than $X/mo.
I have similarly filed support requests, as my modestly sized terraform stack is encountering these issues. Unfortunately their own terraform provider isn't usable on their platform.
Perhaps if backoff were added to the provider, requests would slow down instead of failing outright?
I've hit the same rate-limit problems (to the point where I couldn't even run a single plan anymore) and opened a PR with proposed changes that fix/mitigate this issue. If anyone would like to take a look at the fork: DanielHLelis/terraform-provider-digitalocean-ratelimit.
It basically limits the number of requests per second and/or uses HashiCorp's retryable HTTP client to retry the request after an error (like the 429).
To try it, you just need to install it and do the override as explained in CONTRIBUTING.md.
Hoping to get it into mainstream soon.
Can you link the PR here too?
Updating here to say that I spent over a week going back and forth with support, and they were not at all helpful. Making me wonder if I should move my DNS elsewhere, if DNS + IAC isn't going to be well supported.
We've just released version 2.28.0 of this provider. It adds experimental support for automatically retrying requests that fail with 429 or 500-level response codes. It can be enabled by setting the DIGITALOCEAN_HTTP_RETRY_MAX environment variable or the http_retry_max argument in the provider configuration.
Please let us know if you have any feedback on this functionality. We will be looking to enable it by default in a future release.
Additionally, it adds support for configuring client-side rate-limiting to enforce quality of service. It can be enabled by setting the DIGITALOCEAN_REQUESTS_PER_SECOND environment variable or the requests_per_second argument in the provider configuration.
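For example, both options can be set in the provider block (the values below are illustrative, not recommendations):

```hcl
provider "digitalocean" {
  token = var.do_token

  # Retry requests that fail with 429 or 5xx responses (example value).
  http_retry_max = 4

  # Client-side rate limiting: cap outgoing API requests per second.
  requests_per_second = 2
}
```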
Thanks to @DanielHLelis for working with us on this!
Thanks! This seems to be working initially - I'll do more testing and report back.
That's amazing @DanielHLelis! I've run the latest release with the requests_per_second parameter and it appears to be working very well. I think that now I can merge all my Terraform into one eheh
I reached the rate limit too, after setting requests_per_second the error has gone. Looks like it's doing its job well :)
With the recently released version 2.30.0 of this provider, we have now enabled retries by default. Setting the DIGITALOCEAN_HTTP_RETRY_MAX environment variable or the http_retry_max argument in the provider configuration to 0 will disable this behavior.
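In provider configuration, opting out looks like this (sketch; the token variable name is illustrative):

```hcl
provider "digitalocean" {
  token = var.do_token

  # Retries are on by default as of 2.30.0; 0 disables them.
  http_retry_max = 0
}
```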