Error: Failed to read module directory after upgrading to terraform 1.2.7

Terraform Version

$ terraform version
Terraform v1.2.7
on darwin_amd64

Terraform Configuration Files

module "vpc_endpoints_nocreate" {
  source  = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
  version = "3.7.0"

  create = false
}

Debug Output

2022-08-10T11:55:40.986-1000 [TRACE] modsdir: writing modules manifest to .terraform/modules/modules.json
╷
│ Error: Failed to read module directory
│
│ Module directory .terraform/modules/platform.vpc_endpoints_nocreate/modules/vpc-endpoints does not exist or cannot be read.

Expected Behavior

No errors during init.

Actual Behavior

Init failed with the error shown above.

Steps to Reproduce

terraform init -upgrade

Additional Context

References

I believe this may be related to https://github.com/hashicorp/terraform/pull/31573

Also see the public submodule in the registry at https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest/submodules/vpc-endpoints

Note the double forward slash (//) in the path of the source specified in the example.

Unfortunately, I don't have time to dig into it further right now.

Aug 10 '22 21:08 pickgr

FYI @radeksimko

Aug 10 '22 22:08 pickgr

Not the same error, but I believe the same problem:

Error: Unsupported argument

  on .terraform/modules/network/aws-network/vpc.tf line 8, in module "subnet_addrs":
   8:   base_cidr_block = var.base_cidr_block

An argument named "base_cidr_block" is not expected here.

Error: Unsupported argument

  on .terraform/modules/network/aws-network/vpc.tf line 9, in module "subnet_addrs":
   9:   networks = [

An argument named "networks" is not expected here.

When using https://registry.terraform.io/modules/hashicorp/subnets/cidr/latest

In terraform init it looks fine:

Downloading registry.terraform.io/hashicorp/subnets/cidr 1.0.0 for network.subnet_addrs...
- network.subnet_addrs in .terraform/modules/network.subnet_addrs

But .terraform/modules/network.subnet_addrs is empty...

Aug 10 '22 23:08 sblask

Solved with: 😅

    terraform {
        required_version = "> 1.0, < 1.2.7"
    }

Totally tongue in cheek but yea, I really had to do this - everything I ran was breaking.

╷
│ Error: Failed to read module directory
│ 
│ Module directory .terraform/modules/atlantis.alb_http_sg/modules/http-80 does not exist or cannot be read.
╵

That said, what was really interesting was that I downgraded, ran once, and upgraded again, and everything worked. This tells me the problem is confined to the downloader portion of the code.

Aug 11 '22 04:08 jensenbox

I haven't done any debugging, but I don't see how this could be related to https://github.com/hashicorp/terraform/pull/31573, as that PR touched provider address validation, not module sources. I will let someone from the Core team chime in.

Aug 11 '22 05:08 radeksimko

Problem with v1.2.7 and I reverted back to v1.2.6 which works fine. terraform init seems to work but terraform plan has a bunch of "Unsupported argument" errors.

I am using the terraform-aws-modules/s3-bucket/aws module.

│ Error: Unsupported argument
│
│   on ../main.tf line 9, in module "s3_bucket":
│    9:   bucket = var.name
│
│ An argument named "bucket" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 10, in module "s3_bucket":
│   10:   acl    = "private"
│
│ An argument named "acl" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 12, in module "s3_bucket":
│   12:   attach_policy = var.policy == {} ? false : true
│
│ An argument named "attach_policy" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 13, in module "s3_bucket":
│   13:   policy        = var.policy
│
│ An argument named "policy" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 15, in module "s3_bucket":
│   15:   tags = merge(
│
│ An argument named "tags" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 23, in module "s3_bucket":
│   23:   versioning = {
│
│ An argument named "versioning" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 27, in module "s3_bucket":
│   27:   server_side_encryption_configuration = {
│
│ An argument named "server_side_encryption_configuration" is not expected here.
╵
╷
│ Error: Unsupported argument
│
│   on ../main.tf line 34, in module "s3_bucket":
│   34:   lifecycle_rule = var.enable_std_lifecycle == true ? [{
│
│ An argument named "lifecycle_rule" is not expected here.

Aug 11 '22 06:08 mesaugat

Terraform is totally useless with 1.2.7. Everything breaks

Aug 11 '22 08:08 pascalmtts

We experienced the same thing with the terraform-aws-modules/lambda/aws module as @mesaugat. This doesn't occur with version 1.2.6.

Aug 11 '22 09:08 AlexEndris

We are also experiencing the same problem with this module - terraform-aws-modules/ec2-instance (terraform-version=1.2.7)

Aug 11 '22 10:08 SuchismitaGoswami

We have a similar issue. We use a module from the registry. Unfortunately, the module is downloaded, but the submodules used by that module are not, so half the module code is missing.

Aug 11 '22 13:08 thomaskvnze

Thanks everyone! We are currently investigating the issue.

Aug 11 '22 13:08 jbardin

Same issue for us running terraform version 1.2.7 and trying to download module terraform-aws-modules/iam/aws/modules/iam-role-for-service-accounts-eks

Aug 11 '22 16:08 tibz-enex

The problem appears to have originated from the registry and numerous incorrectly cached responses. Please let us know if there are any modules which continue to exhibit this behavior with v1.2.7.

Aug 11 '22 19:08 jbardin

> The problem appears to have originated from the registry and numerous incorrectly cached responses. Please let us know if there are any modules which continue to exhibit this behavior with v1.2.7.

I'm still seeing this with the original module I reported?

╷
│ Error: Failed to read module directory
│
│ Module directory .terraform/modules/platform.vpc_endpoints_nocreate/modules/vpc-endpoints does not exist or cannot be read.
╵

Aug 11 '22 19:08 pickgr

Thanks @pickgr, I'll let them know not all the URLs have been purged.

Aug 11 '22 19:08 jbardin

Hello, we also have the problem with this module: https://registry.terraform.io/modules/terraform-aws-modules/alb/aws/6.10.0 Thanks for your help, @jbardin ;)

Aug 11 '22 19:08 jcolfej

Hi all! Thanks for reporting this incorrect behavior.

If you need to use a module that has an incorrect cache entry that hasn't yet been purged, I believe it should work to stay on Terraform CLI v1.2.6 for the moment (since the cached registry responses for that version are still correct) until the modules you need to use have had their caches purged.

Please do let us know if you've found a problem with a module that wasn't already mentioned above, though, so we can take full stock of the scope of this when we run a retrospective later. If possible it would be ideal to see exactly what you have in both the source and version arguments in your module block, just so that we can get a better sense of what is and is not affected.

As a little additional context about what seems to be going on here, for those who are following along with the details right now, or those who might find this issue in future and wonder what was going on:

Terraform Registry implements the module registry protocol, which is essentially just an indirection over module sources that layers on the idea of there being multiple versions of each logical module. The registry is therefore really just an index of module packages published elsewhere, and doesn't truly host anything itself.

For the public Terraform Registry in particular, the "elsewhere" is GitHub repositories, and so when Terraform CLI asks registry.terraform.io a question like "Where can I find version 3.7.0 of terraform-aws-modules/vpc/aws?", the registry responds by returning a module package address just like you might've written directly into the source argument if you weren't using the registry, referring to a path in the underlying GitHub repository.

For reasons we're not yet quite sure about, it seems that the registry's CDN cache for certain module versions got "poisoned" with a legacy incorrect URL that doesn't refer to the right directory within the module package. So far it seems that the cached response was an old-style URL to a source tarball on GitHub. GitHub's source tarballs put the repository content into a subdirectory named after the repository rather than directly in the root, so the subdirectory path for the module mentioned in the leading comment would really be //terraform-aws-vpc-v3.7.0/modules/vpc-endpoints rather than just //modules/vpc-endpoints. When Terraform looked at the incorrect path the registry returned, it found the directory missing.
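To make the path mismatch concrete, here is a minimal shell sketch that recreates the layout described above in a scratch directory. Nothing is downloaded; the directory names are copied from this comment, and the helper names make_tarball_layout and subdir_exists are hypothetical, chosen only for illustration.

```shell
# Recreate, in the scratch directory given as $1, the layout described above:
# the repository content sits under a directory named after the repository
# instead of directly at the package root.
make_tarball_layout() {
  mkdir -p "$1/terraform-aws-vpc-v3.7.0/modules/vpc-endpoints"
}

# Succeed only if subdirectory $2 exists inside package root $1.
subdir_exists() {
  [ -d "$1/$2" ]
}
```

After make_tarball_layout "$pkg", the check subdir_exists "$pkg" modules/vpc-endpoints fails, while subdir_exists "$pkg" terraform-aws-vpc-v3.7.0/modules/vpc-endpoints succeeds, which mirrors the "does not exist or cannot be read" error in the original report.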

We're still investigating what exactly happened here. At first blush it seems that a backward-compatibility heuristic somehow miscategorized Terraform v1.2.7 as an older version of Terraform requiring a protocol shim, and then that response got cached for certain modules. For the moment we're doing quick mitigation by purging the cache for specific modules so that they work again, but we also need to track down the root cause of why the incorrect response was returned in the first place.

Aug 11 '22 19:08 apparentlymart

> Please do let us know if you've found a problem with a module that wasn't already mentioned above, though, so we can take full stock of the scope of this when we run a retrospective later. If possible it would be ideal to see exactly what you have in both the source and version arguments in your module block, just so that we can get a better sense of what is and is not affected.

We see this problem with private modules that are hosted on GitHub and referenced by their GitHub URL and tag. They work fine with 1.2.6 and are broken with 1.2.7 (with similar "An argument named "blah" is not expected here." errors).

Aug 11 '22 19:08 llamahunter

We're seeing this with https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/ since yesterday afternoon.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.24.1"
  ...
}

Aug 11 '22 20:08 mconigliaro

If you are still looking for examples of what was failing...

module "log_bucket" {
  source = "registry.terraform.io/terraform-aws-modules/s3-bucket/aws"
}

No version applied.

Aug 11 '22 20:08 jensenbox

I know terraform-aws-modules/lambda/aws was mentioned above but either it wasn't purged or the purge is not enough to fix the problem. It's still failing on our completely clean CI runners with no local cache.

Aug 11 '22 20:08 acdha

I'm still seeing issues with https://registry.terraform.io/modules/terraform-aws-modules/rds-aurora.

source  = "terraform-aws-modules/rds-aurora/aws"
version = "7.2.2"

Aug 11 '22 20:08 mdodsworth

OK, we located the URLs which were known to be affected, and purging the cache for those is now complete. If you did init with a bad response, removing the .terraform directory to get rid of the cached modules may be necessary to force Terraform to re-download the correct URL.

Aug 11 '22 22:08 jbardin

> OK, we located the URLs which were known to be affected, and purging the cache for those is now complete. If you did init with a bad response, removing the .terraform directory to get rid of the cached modules may be necessary to force Terraform to re-download the correct URL.

Still not working with our Atlantis deploys. Is there some way to tell Atlantis to clean its cache directory? Also, why not release 1.2.8 that fixes whatever cache corruption 1.2.7 introduced?

Aug 11 '22 22:08 llamahunter

> OK, we located the URLs which were known to be affected, and purging the cache for those is now complete.

In this case, the caches being referred to are the caches Terraform Registry uses to look up the correct response to a query for a particular module+version. These are all "server-side" from the perspective of a Terraform CLI user. Now that the caches have been purged, they return the correct responses to those queries.

> Also, why not release 1.2.8 that fixes whatever cache corruption 1.2.7 introduced?

The local .terraform folder cache is not corrupted per se; it may just contain incorrect URL data. I am not sure that it would be possible to detect which URLs were incorrect in any reliable way. This is a case in which you likely know whether you are impacted by the issue, and can apply the remediation on your end.

Unfortunately, as we do not develop or maintain Atlantis, I do not believe we know how to clear the cache folder for Atlantis. (edit: "we" as in the Terraform Core team). Thanks for your questions!

Aug 11 '22 23:08 crw

I've confirmed everything is working for me with 1.2.7 now. Note that running terraform init -upgrade may be an alternative to manually removing the .terraform directory.

Thanks for the quick turnaround everyone!

Aug 12 '22 00:08 pickgr

I think my comment above may have created some confusion when considered in conjunction with the other kind of "cache" some are discussing here, so just to clarify:

In my case, I was discussing the remote cache living in the CDN that provides the Terraform Registry service. That cache is under the control of our Terraform Registry team and so they are able to proactively purge it; that is what @jbardin meant above when he said that we have purged the caches.

Terraform CLI, in terraform init, also saves itself a local manifest file to remember what it has installed. That file is an implementation detail, but in current Terraform it is a JSON file living under the .terraform directory. It includes (amongst other things) the "subdirectory" path within the locally-cached module package to use for each particular module. This is needed because a module package can potentially contain many different modules, and Terraform needs to "remember" which one to use when you subsequently run terraform apply.

Since this issue effectively caused the registry to report incorrect subdirectory paths, it seems like some of you now have incorrect subdirectory paths in the local manifest file too. We cannot proactively purge that because it's on your own local computers, but as @pickgr noted one way to deal with it is to run terraform init -upgrade since the -upgrade option effectively forces terraform init to ignore what's in the manifest file. However, -upgrade also causes Terraform to ignore .terraform.lock.hcl and might thereby also perform unwanted provider upgrades, if you're relying on the dependency lock file to retain your currently-selected provider versions.
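If you want to see whether your local manifest points at directories that do not exist, the following shell sketch may help. It assumes the manifest lives at .terraform/modules/modules.json (the path shown in the trace output at the top of this issue) and that each entry records its directory in a "Dir" field relative to the working directory. Both are implementation details that may change between Terraform versions, and the function name find_missing_module_dirs is hypothetical.

```shell
# Print every module directory recorded in the manifest that is missing on
# disk. $1 is the Terraform working directory (default: current directory).
# Assumption: each manifest entry has a "Dir" string field, relative to the
# working directory, with no escaped characters in the path.
find_missing_module_dirs() {
  workdir="${1:-.}"
  manifest="$workdir/.terraform/modules/modules.json"
  [ -f "$manifest" ] || { echo "no manifest at $manifest" >&2; return 2; }

  missing=$(
    grep -o '"Dir"[[:space:]]*:[[:space:]]*"[^"]*"' "$manifest" |
      sed -e 's/^"Dir"[[:space:]]*:[[:space:]]*"//' -e 's/"$//' |
      while IFS= read -r dir; do
        [ -d "$workdir/$dir" ] || printf 'missing: %s\n' "$dir"
      done
  )

  # Return 1 if at least one recorded directory is absent, 0 otherwise.
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
  return 0
}
```

Running find_missing_module_dirs from the root of the affected configuration lists any stale entries. This only reads the manifest; it does not modify anything.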

A more "surgical" answer, focusing only on modules, is to delete the .terraform/modules directory and all of its contents before you run terraform init. That removes both the JSON manifest file and your local caches of the modules, forcing Terraform to reinstall them using the now-purged remote registry cache, which should hopefully recreate the manifest file with the correct paths.
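As a minimal sketch of that module-only cleanup, the shell helper below removes just the .terraform/modules directory for a given working directory. The function name reset_module_cache is hypothetical; the provider cache and .terraform.lock.hcl are deliberately left alone.

```shell
# Remove only the local module cache and its manifest for the Terraform
# working directory given as $1 (default: current directory).
# Installed providers and .terraform.lock.hcl are not touched.
reset_module_cache() {
  workdir="${1:-.}"
  rm -rf -- "$workdir/.terraform/modules"
}
```

A typical use would be reset_module_cache . followed by terraform init, so that Terraform reinstalls the modules and writes a fresh manifest.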

Aug 12 '22 02:08 apparentlymart

> Is there some way to tell atlantis to clean its cache directory?

atlantis unlock will clear out the workspace on disk, including .terraform dir.

Aug 14 '22 04:08 arohter

> > Is there some way to tell atlantis to clean its cache directory?
>
> atlantis unlock will clear out the workspace on disk, including .terraform dir.

I had tried that, but it seemed to still be poisoning the cache. Or maybe it wasn't 'fixed' yet?

Aug 15 '22 15:08 llamahunter

Closing since the registry issue was resolved and there have been no further incident reports.

Aug 29 '22 16:08 jbardin

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

Sep 29 '22 02:09 github-actions[bot]
