terragrunt
Misleading error log on 429 errors from registry
Hey,
Recently we were throttled by the HashiCorp registry (as we confirmed by talking with their support), but it was impossible to tell from our error logs, even with debug logging enabled.
We run terragrunt plan commands at high scale, and this is the error log we got:
13:14:42.342 INFO Terragrunt Cache server is listening on 127.0.0.1:38691
13:14:42.342 INFO Start Terragrunt Cache server
13:14:42.752 INFO Downloading Terraform configurations from git::ssh://[email protected]/<organization>/<repo>.git?ref=lambda_v1.1.37 into ./.terragrunt-cache/uCvmXUGu5jL2E_fX-K1egZAxmTs/bkwVI7uwl97tkFlbhBGmEdngoJk
13:14:43.657 INFO Caching terraform providers for ./.terragrunt-cache/uCvmXUGu5jL2E_fX-K1egZAxmTs/bkwVI7uwl97tkFlbhBGmEdngoJk/lambda
13:14:44.355 ERROR terraform invocation failed in ./.terragrunt-cache/uCvmXUGu5jL2E_fX-K1egZAxmTs/bkwVI7uwl97tkFlbhBGmEdngoJk/lambda
13:14:44.355 INFO Shutting down Terragrunt Cache server...
13:14:44.355 INFO Terragrunt Cache server stopped
13:14:44.355 ERROR 3 errors occurred:
* unable to cache provider: registry.terraform.io/hashicorp/external v2.3.4, err: not found provider download url
* unable to cache provider: registry.terraform.io/hashicorp/local v2.5.2, err: not found provider download url
* unable to cache provider: registry.terraform.io/hashicorp/null v3.2.3, err: not found provider download url
The error contains no information about rate limiting or the 429 status code, which makes the problem much harder to recognize and troubleshoot.
We traced this error message to this code: https://github.com/gruntwork-io/terragrunt/blob/main/tf/cache/services/provider_cache.go#L218-L236
It says nothing about a 429, and without HashiCorp support we would not have understood what was happening.
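Until the log carries the status code, one way to check for throttling independently is to request the registry's provider download endpoint directly and look at the HTTP status. The following is a minimal Python sketch, not Terragrunt code. It assumes the public provider registry protocol path `/v1/providers/<namespace>/<type>/<version>/download/<os>/<arch>`; the function names `download_endpoint`, `probe`, and `check` are hypothetical, and whether the registry sends a `Retry-After` header is also an assumption, so the sketch only reports it if present.

```python
import urllib.error
import urllib.request

REGISTRY = "https://registry.terraform.io"


def download_endpoint(namespace, name, version, os_name="linux", arch="amd64"):
    """Build the registry URL that returns download metadata for one provider build."""
    return f"{REGISTRY}/v1/providers/{namespace}/{name}/{version}/download/{os_name}/{arch}"


def probe(url, opener=urllib.request.urlopen):
    """Return (status_code, retry_after) for a GET on url.

    Error statuses such as 429 arrive as HTTPError, so they are caught and
    reported instead of being collapsed into a generic failure.
    """
    try:
        with opener(url, timeout=10) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        headers = err.headers or {}
        return err.code, headers.get("Retry-After")


def check(providers, opener=urllib.request.urlopen):
    """Yield one readable line per (namespace, name, version) tuple."""
    for namespace, name, version in providers:
        status, retry_after = probe(download_endpoint(namespace, name, version), opener)
        note = " (rate limited)" if status == 429 else ""
        yield f"{namespace}/{name} {version}: HTTP {status}{note}, Retry-After={retry_after}"
```

For the providers in the log above, `for line in check([("hashicorp", "external", "2.3.4"), ("hashicorp", "local", "2.5.2"), ("hashicorp", "null", "3.2.3")]): print(line)` would print one status line each. Run it from the same network egress as the failing jobs, since throttling is presumably applied per client.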
Steps To Reproduce
Get a 429 from the HashiCorp registry (according to their support, around 3000 requests in 5 minutes is enough to trigger it).
Also, have Terragrunt use its provider caching mechanism:
export TERRAGRUNT_PROVIDER_CACHE=1
The error message is the same on 0.69.9 and 0.71.1.
Expected behavior
We want a clear message that says the request failed with 429 (rate limiting), just as we would get when using Terraform directly.
Versions
- Terragrunt version: 0.69.9 / 0.71.1, but the latest should behave the same, since as far as I can see the code hasn't changed.
- OpenTofu/Terraform version: 1.10.1
- Environment details (Ubuntu 20.04, Windows 10, etc.): Ubuntu 20.04
I definitely agree that the error message could be better.
Requesting community contributions, as you've already done some root cause analysis, and it shouldn't be too hard for someone from the community to submit a fix.
If I switch to not using provider cache, I can successfully download the provider. How am I getting throttled, then?
Hi, I'm encountering the same error since upgrading from terragrunt version 0.77.20 to version 0.77.22. I'm running terragrunt in a GitHub Action, and the provider cache is enabled. By disabling the provider cache, the execution time moves from ~15 minutes to more than one hour, which is not a viable option for my use case. This is the log:
$ terragrunt run-all plan --non-interactive --provider-cache --working-dir / --queue-exclude-dir /package-excluded --parallelism 16.
09:55:12.280 INFO Terragrunt Cache server is listening on 127.0.0.1:34695
09:55:12.281 INFO Start Terragrunt Cache server
09:55:12.281 WARN [package-1] Using `terragrunt.hcl` as the root of Terragrunt configurations is an anti-pattern, and no longer recommended. In a future version of Terragrunt, this will result in an error. You are advised to use a differently named file like `root.hcl` instead. For more information, see https://terragrunt.gruntwork.io/docs/migrate/migrating-from-root-terragrunt-hcl
09:55:12.327 INFO The stack at . will be processed in the following order for command plan:
Group 1
- Module ./package-1
- Module ./package-2
- Module ./package-3
09:55:12.483 INFO [package-3] Downloading Terraform configurations from ../../../infrastructure/aws/modules/package-3 into ./package-3/.terragrunt-cache/x0y2mDHkVl_YWrxR4X4DOxEi2W8/0Crg5bhSe0-tVGKO1jqBTKWtLz8
09:55:12.705 INFO [package-3] Caching terraform providers for ./package-3/.terragrunt-cache/x0y2mDHkVl_YWrxR4X4DOxEi2W8/0Crg5bhSe0-tVGKO1jqBTKWtLz8
09:55:12.713 INFO [package-2] Downloading Terraform configurations from ../../../infrastructure/aws/modules/package-2 into ./package-2/.terragrunt-cache/mDwY25I5tlSFoLuoEhRm5IPj-CU/wqjnzHi4nj1rYZe7PfusmikcUA0
09:55:12.738 INFO [package-1] Downloading Terraform configurations from ../../../infrastructure/aws/modules/empty into ./package-1/.terragrunt-cache/zXezC686oBq6SvexVuBiO7c9OvA/zYUClprxT3148oUC4lAKDoP8O1w
09:55:12.761 INFO [package-2] Caching terraform providers for ./package-2/.terragrunt-cache/mDwY25I5tlSFoLuoEhRm5IPj-CU/wqjnzHi4nj1rYZe7PfusmikcUA0
09:55:12.788 INFO [package-1] Caching terraform providers for ./package-1/.terragrunt-cache/zXezC686oBq6SvexVuBiO7c9OvA/zYUClprxT3148oUC4lAKDoP8O1w
09:55:13.070 ERROR [package-3] terraform invocation failed in ./package-3/.terragrunt-cache/x0y2mDHkVl_YWrxR4X4DOxEi2W8/0Crg5bhSe0-tVGKO1jqBTKWtLz8
09:55:13.070 ERROR [package-3] Module ./package-3 has finished with an error
09:55:13.110 ERROR [package-2] terraform invocation failed in ./package-2/.terragrunt-cache/mDwY25I5tlSFoLuoEhRm5IPj-CU/wqjnzHi4nj1rYZe7PfusmikcUA0
09:55:13.110 ERROR [package-2] Module ./package-2 has finished with an error
09:55:13.118 ERROR [package-1] terraform invocation failed in ./package-1/.terragrunt-cache/zXezC686oBq6SvexVuBiO7c9OvA/zYUClprxT3148oUC4lAKDoP8O1w
09:55:13.118 ERROR [package-1] Module ./package-1 has finished with an error
09:55:13.119 INFO Shutting down Terragrunt Cache server...
09:55:13.119 INFO Terragrunt Cache server stopped
09:55:13.119 ERROR 3 errors occurred:
* unable to cache provider: registry.terraform.io/hashicorp/aws v5.96.0, err: not found provider download url
* unable to cache provider: registry.terraform.io/hashicorp/aws v5.96.0, err: not found provider download url
* unable to cache provider: registry.terraform.io/hashicorp/aws v5.96.0, err: not found provider download url
09:55:13.119 ERROR Unable to determine underlying exit code, so Terragrunt will exit with error code 1
Thanks for the help!
UPDATE: Switching from version 0.77.20 to 0.77.22 has no impact on the issue, but after rolling back the AWS Terraform provider to version 5.95.0, everything worked as expected.
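When the same error repeats across many units, grouping the failures by provider and version can hint at whether they concentrate on a single release (as with aws v5.96.0 here) or spread across several providers (as in the original report). A small Python sketch of that grouping follows. The helper name `summarize_cache_errors` is hypothetical, and the regular expression assumes the message format shown in the logs above.

```python
import re
from collections import Counter

# Matches the summary lines printed at the end of a failed run, e.g.
# "* unable to cache provider: registry.terraform.io/hashicorp/aws v5.96.0, err: not found provider download url"
PATTERN = re.compile(r"unable to cache provider: (\S+) (v[^,\s]+), err: (.+)")


def summarize_cache_errors(log_text):
    """Count cache failures per (provider address, version, error text)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            key = (match.group(1), match.group(2), match.group(3).strip())
            counts[key] += 1
    return counts
```

A single key with a high count points at one provider release; many distinct keys failing at once would be more consistent with throttling or a broader outage. Neither pattern is proof on its own.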
We eventually spoke with HashiCorp, and they confirmed we were throttled on several occasions.
The problem is not whether a given provider is throttled. The problem is that the log gives no indication, even at debug level, that the status code is 429. That is misleading, because err: not found provider download url is not the actual issue.
My issue turned out to be the CloudFront cache issue with the AWS provider v5.96.0 :(. Bottom line is we need better logging in the provider cache server.
We had the same issue. Now I can confirm I'm able to download the newer version of the provider even with the provider cache enabled
Hi! I'm having the same error with the hashicorp/terraform-provider-archive provider. I understand this is not strictly related to Terragrunt, but I don't understand what happens when disabling the terragrunt provider cache. Basically, when I use the option --provider-cache I see the error registry.terraform.io/hashicorp/archive v2.7.1, err: not found provider download url. If I don't use that option, everything works as expected, slower, but it works. May I ask for a clarification?
Thanks!
Any update on this issue? It does not happen consistently: I'm using terragrunt run --all apply --provider-cache, and sometimes it passes and sometimes it fails.
Terragrunt version: v0.78.4 Terraform version: 1.12.0
Are there any workarounds for caching providers across multiple modules (e.g. registry.terraform.io/hashicorp/aws v5.98.0) other than the Terragrunt Provider Cache Server?
Thanks!
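One candidate workaround is Terraform's own plugin cache, enabled through the `TF_PLUGIN_CACHE_DIR` environment variable. The Python sketch below is an illustration under stated assumptions, not a recommendation from the maintainers. The helper names `plugin_cache_env` and `init_sequentially` are hypothetical. Terraform documents that it will not create the cache directory itself and that the plugin cache is not guaranteed to be concurrency safe, so the sketch creates the directory and runs `terraform init` one module at a time.

```python
import os
import subprocess


def plugin_cache_env(cache_dir, base_env=None):
    """Return a copy of the environment with TF_PLUGIN_CACHE_DIR set.

    Terraform does not create the cache directory itself, so create it here.
    """
    cache_dir = os.path.abspath(os.path.expanduser(cache_dir))
    os.makedirs(cache_dir, exist_ok=True)
    env = dict(os.environ if base_env is None else base_env)
    env["TF_PLUGIN_CACHE_DIR"] = cache_dir
    return env


def init_sequentially(module_dirs, cache_dir):
    """Run `terraform init` one module at a time so only one process writes to the cache."""
    env = plugin_cache_env(cache_dir)
    for module_dir in module_dirs:
        subprocess.run(
            ["terraform", "init", "-input=false"],
            cwd=module_dir,
            env=env,
            check=True,
        )
```

Initializing sequentially gives up the parallelism that makes the Provider Cache Server attractive, so this trades speed for fewer registry requests after the first download.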
Hey folks,
It looks like this is an issue with the Terraform provider registry, not with anything the Terragrunt or Terraform binaries are doing. It seems to be a misconfiguration in their CDN caching strategy, and as such, we won't be able to give any updates on this issue.
If someone has evidence to the contrary, we can look into it.
EDIT: For what it's worth, nobody using OpenTofu is reporting this issue, so it might be worth exploring using the OpenTofu registry instead.
Thank you for this issue! I've been encountering this with the kubernetes provider v2.38.0 which was released 2 days ago.
The speculation about a CDN misconfiguration for the release nudged me toward pinning to v2.37.1, and that resolved my error messages. Thanks @yhakbar for the comment that resolved my issue.
Terragrunt version: v0.76.6 Terraform version: v1.12.2