terraform-provider-digitalocean
429 Too many requests or 429 API Rate limit exceeded
Terraform Version
Terraform v0.13.5
+ provider registry.terraform.io/-/aws v3.22.0
+ provider registry.terraform.io/-/digitalocean v2.3.0
+ provider registry.terraform.io/digitalocean/digitalocean v2.3.0
+ provider registry.terraform.io/hashicorp/aws v3.22.0
Expected Behavior
Terraform to complete successfully
Actual Behavior
Terraform is getting errors back from DO API stating either Too Many Requests or API Rate limit exceeded.
Steps to Reproduce
Manage 100 domain records and it will fail while trying to refresh the state.
Important Factoids
I am managing 10 domains with about 10-20 records each. I have dropped parallelism down to 1 and I still can't get a successful run without hitting one of these two errors. This isn't an absurdly large Terraform run, yet I can't get through more than a handful of state refreshes before it errors out.
+1. Same problem for me. I need to manage 200+ droplets.
Based on the Docs
Requests through the API are rate limited per OAuth token. Current rate limits:
5,000 requests per hour
250 requests per minute (5% of the hourly total)
So one thing you can possibly do is reach out to support and see if they can increase the limit for requests per minute. The only other thing you might want to investigate is breaking the Terraform code into a module per domain so that you won't hit the limit. I realize this might not be possible or suitable for your use case, but it's something to think about.
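As a sketch of that idea (the module path and variable names here are hypothetical), each domain would get its own module instance, which also makes it easier to later split the domains into separate root configurations with their own state so each refresh only touches one domain's records:

```hcl
# Hypothetical layout: one module instance per domain. Moving each
# instance into its own root configuration (separate state) keeps the
# number of API calls per refresh small.
module "example_com" {
  source  = "./modules/domain"
  domain  = "example.com"
  records = var.example_com_records
}

module "example_org" {
  source  = "./modules/domain"
  domain  = "example.org"
  records = var.example_org_records
}
```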
+1.
Hundreds of domain records (same domain) to manage.
+1
This happens to me while managing a single large domain (~ 150 records)
module.spaces.digitalocean_cdn.assets: Refreshing state... [id=0ac3dd96-3612-4638-a802-10dadc4aecea]
╷
│ Error: Error retrieving DatabaseCluster: GET https://api.digitalocean.com/v2/databases/1cc4ed58-cd77-43c0-9a61-61062b2c12f7: 429 Too many requests
│
│   with module.databases.digitalocean_database_cluster.infra1,
│   on ../../modules/databases/main.tf line 11, in resource "digitalocean_database_cluster" "infra1":
│   11: resource "digitalocean_database_cluster" "infra1" {
│
╵
╷
│ Error: Error retrieving DatabaseCluster: GET https://api.digitalocean.com/v2/databases/33ce4a3f-1760-4e78-8110-5f4b356da009: 429 Too many requests
│
│   with module.databases.digitalocean_database_cluster.router1,
│   on ../../modules/databases/main.tf line 33, in resource "digitalocean_database_cluster" "router1":
│   33: resource "digitalocean_database_cluster" "router1" {
│
╵
╷
│ Error: Error retrieving Kubernetes cluster: GET https://api.digitalocean.com/v2/kubernetes/clusters/d52645d0-93e9-4594-aa9a-7b9dbecef8c5: 429 Too many requests
│
│   with module.kubernetes.digitalocean_kubernetes_cluster.app,
│   on ../../modules/kubernetes/kubernetes.tf line 2, in resource "digitalocean_kubernetes_cluster" "app":
│    2: resource "digitalocean_kubernetes_cluster" "app" {
│
╵
╷
│ Error: Error reading CDN: GET https://api.digitalocean.com/v2/cdn/endpoints/0ac3dd96-3612-4638-a802-10dadc4aecea: 429 Too many requests
│
│   with module.spaces.digitalocean_cdn.assets,
│   on ../../modules/spaces/main.tf line 44, in resource "digitalocean_cdn" "assets":
│   44: resource "digitalocean_cdn" "assets" {
│
╵
Releasing state lock. This may take a few moments...
I'm literally blocked here and I don't understand why. Our tf stack is not big by any means. DO has extremely low rate limits everywhere (e.g. using Loki with Spaces is impossible). We're getting random API errors from the k8s API and the Spaces API, and the CDN API returns 50x every so often. It's getting really frustrating.
Same problem here.
So one thing you can possibly do is reach out to support and see if they can increase the limit for requests per minute.
I just tried that. The answer was simply that they are doing this "to protect the platform". So they won't increase the limits.
I now asked if they could implement at least something like a burst limit which would allow to overreach the rate limit for a short amount of time. That would probably help with use cases such as Terraform.
I'm the only one having this problem in a team of three. My coworkers can plan. I have issued tokens that are used by applications, I wonder if they're using the API and I'm hitting a global limit.
Since @djmaze and me are hit by this at the same point in time, I wonder if there's something else going on.
@djmaze It works now; my limit was apparently reset a few minutes ago, but I'll be keeping an eye on it. Here's how you can see the rate:
curl -H "Authorization: Bearer $DIGITALOCEAN_ACCESS_TOKEN" -v -I "https://api.digitalocean.com/v2/images?private=true"
...
< ratelimit-limit: 5000
< ratelimit-remaining: 4998
< ratelimit-reset: 1634198529
...
API documentation: https://docs.digitalocean.com/reference/api/api-reference/#section/Introduction/Rate-Limit
I am getting this reproducibly when running our Terraform plan twice within one or two minutes. The first run works, the second one fails. It has always been like this and has not changed for me.
Posting this here in case anyone else hits the same issue. We're using a remote secrets store (doppler) and the problem was that the entire team and a few apps were using a common token, fetched by the secrets store.
Today this hit me for the first API request while refreshing terraform state, without me doing anything on DO except a Kubernetes “login”.
I saw @tback creating a possible fix for it; any chance it will be released soon? (The last release is from two days before the fix landed)
We too regularly run into the per-minute rate-limiting of API calls. As such we too would greatly appreciate a solution to this. Be it in the form of an option for increased rate limits on DO's side, the Terraform provider automatically retrying failed calls, or even just adding support for rate limiting to the provider. We'd much prefer a deployment taking a few minutes longer, over it potentially failing.
Some background on our use-case, should it help: We regularly spawn on the order of 100 VMs for a few hours at a time, in order to profile distributed applications of ours. These deployments are automated with Terraform. Each VM we provision will lead to multiple API calls, as we also set up DNS records for each, assign it to a project, and so on.
As we regularly get rate limited this then requires multiple calls to terraform apply - spaced apart a few minutes each time - for the stack to be deployed fully. This is inconvenient when doing it manually, and a nightmare in CI.
I just installed the provider manually with a provider override and it worked for me as long as it took me to migrate away from DO.
+1 Would pay extra to not have to worry about rate-limiting the API, which I use to provision resources and therefore pay DigitalOcean more money. If there were an option to pay $X/mo for a much higher or unlimited API rate limit, I would choose that option at this moment. My other option is to spend time building rate limiting into my side of the app, which is going to take way more of my time than $X/mo.
I have similarly filed support requests, as my modestly sized terraform stack is encountering these issues. Unfortunately their own terraform provider isn't usable on their platform.
Perhaps if backoff were added to the provider, requests would slow down instead of failing outright?
I've hit the same rate-limit problems (to the point where I couldn't even run a single plan anymore) and opened a PR with proposed changes that fix/mitigate this issue. If anyone would like to take a look at the fork: DanielHLelis/terraform-provider-digitalocean-ratelimit.
It basically limits the number of requests per second and/or uses HashiCorp's retryable HTTP client to retry the request after an error (like the 429).
To try it, you just need to install it and do the override as explained in CONTRIBUTING.md.
Hoping to get it into mainstream soon.
Can you link the PR here too?
Updating here to say that I spent over a week going back and forth with support, and they were not at all helpful. Making me wonder if I should move my DNS elsewhere, if DNS + IAC isn't going to be well supported.
We've just released version 2.28.0 of this provider. It adds experimental support for automatically retrying requests that fail with 429 or 500-level response codes. It can be enabled by setting the DIGITALOCEAN_HTTP_RETRY_MAX environment variable or the http_retry_max argument in the provider configuration.
Please let us know if you have any feedback on this functionality. We will be looking to enable it by default in a future release.
Additionally, it adds support for configuring client-side rate-limiting to enforce quality of service. It can be enabled by setting the DIGITALOCEAN_REQUESTS_PER_SECOND environment variable or the requests_per_second argument in the provider configuration.
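For example, both options can be set in the provider block (the values below are illustrative, not recommendations):

```hcl
provider "digitalocean" {
  token = var.do_token

  # Retry requests that fail with 429 or 5xx responses (example value).
  http_retry_max = 4

  # Client-side rate limiting: cap outgoing API requests per second.
  requests_per_second = 2
}
```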
Thanks to @DanielHLelis for working with us on this!
Thanks! This seems to be working initially - I'll do more testing and report back.
That's amazing @DanielHLelis! I've run the latest release with the requests_per_second parameter and it appears to be working very well. I think that now I can merge all my Terraform into one eheh
I reached the rate limit too, after setting requests_per_second the error has gone. Looks like it's doing its job well :)
With the recently released version 2.30.0 of this provider, we have now enabled retries by default. Setting the DIGITALOCEAN_HTTP_RETRY_MAX environment variable or the http_retry_max argument in the provider configuration to 0 will disable this behavior.
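In provider configuration, opting out looks like this (sketch; the token variable name is illustrative):

```hcl
provider "digitalocean" {
  token = var.do_token

  # Retries are on by default as of 2.30.0; 0 disables them.
  http_retry_max = 0
}
```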