
"terraform providers" command must succeed even if the lock file or provider cache directory is invalid

Open raffato opened this issue 3 years ago • 4 comments

Current Terraform Version

1.2.1

Use-cases

My company uses Apple hardware and most of my team now have M1 MacBooks. I try to run terraform init but the codebase has 50+ instances of data "template_file".... Fixing this seems to be straightforward: replace template_file data sources with templatefile() functions. But when I'm done, terraform init still fails because "something" still wants the template provider. I try to run terraform providers to identify what requires it, but this fails without giving any helpful information because the provider is missing. Duh! I wonder if sub-modules require it, so I unpin all module versions, but terraform init still fails with this kinda useless message:

Error: Required plugins are not installed
The installed provider plugins are not consistent with the packages selected in the dependency lock file:
 - registry.terraform.io/hashicorp/template: there is no package for registry.terraform.io/hashicorp/template 2.2.0 cached in .terraform/providers

Terraform uses external plugins to integrate with a variety of different infrastructure services. To download the plugins required for this configuration,
run:
  terraform init

TF_LOG=TRACE generates a ton of text but nothing that points to a root cause. After cursing Terraform for a day, the next morning (shower thoughts) it dawns on me that Terraform needs the provider because the current state references it. But I have to guess this because I can't run terraform providers.
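When terraform providers is unavailable, the state itself can be inspected to confirm that guess. The following is a minimal sketch, not part of Terraform: it assumes a state snapshot in the version 4 JSON format, where each entry under "resources" carries "mode", "type", "name", "provider" and, for resources inside modules, "module". The function names providers_in_state and load_state are hypothetical helpers chosen for this example.

```python
import json
from collections import defaultdict
from typing import Dict, List


def providers_in_state(state: dict) -> Dict[str, List[str]]:
    """Map each provider address in a state snapshot to the resources that use it.

    Assumes the version 4 state layout; other layouts are not handled.
    """
    usage = defaultdict(list)
    for res in state.get("resources", []):
        # Data resources are recorded with mode "data", managed ones with "managed".
        prefix = "data." if res.get("mode") == "data" else ""
        address = f"{prefix}{res['type']}.{res['name']}"
        # Resources declared in a child module carry the module path, e.g. "module.vault".
        if "module" in res:
            address = f"{res['module']}.{address}"
        usage[res["provider"]].append(address)
    return dict(usage)


def load_state(path: str) -> dict:
    """Read a state snapshot from a local JSON file (hypothetical helper)."""
    with open(path) as f:
        return json.load(f)
```

Calling providers_in_state(load_state("terraform.tfstate")) on a local state file would show whether hashicorp/template is still referenced, and whether only data resources refer to it. For remote backends the snapshot would have to be exported to a file first.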

Attempted Solutions

Tried to run terraform providers but it fails with this error:

╷
Error: Required plugins are not installed
The installed provider plugins are not consistent with the packages selected in the dependency lock file:
   - registry.terraform.io/hashicorp/template: there is no package for registry.terraform.io/hashicorp/template 2.2.0 cached in .terraform/providers

Terraform uses external plugins to integrate with a variety of different infrastructure services. To download the plugins required for this configuration, run:
   terraform init

Proposal

  1. Allow terraform providers to provide partial information! Surely it knows which providers are currently required, even if they're not installed? Should it not simply flag that it doesn't have specific version information because terraform init is required, instead of bailing out and providing nothing?
  2. If a plugin cannot be found, the same partial providers output would be useful in the terraform init error message. For example, a much more useful error message could look like this:
Error: Required plugins are not installed

The installed provider plugins are not consistent with the packages selected in the dependency lock file:
  - registry.terraform.io/hashicorp/template: there is no package for registry.terraform.io/hashicorp/template 2.2.0 cached in .terraform/providers

Providers required by configuration:
├── module.database
└── module.vault
    └── provider[registry.terraform.io/hashicorp/template]

Providers required by state:
    provider[registry.terraform.io/hashicorp/template]

Terraform uses external plugins to integrate with a variety of different infrastructure services. To download the plugins required for this configuration,
run:
   terraform init

References

https://github.com/hashicorp/terraform/issues/29993#issuecomment-1023787499

raffato, May 26 '22 11:05

Thanks for the issue report. Verified that the "Required plugins are not installed" error happens if you have a lockfile with providers from a non-matching architecture, then run terraform providers. It does seem that we can do better than this error and should be able to output a list of providers even in this case.

If you delete the lockfile, terraform providers will give the output you describe in the proposal, distinguishing between providers required by config and providers required by state. Is this a suitable workaround for your issue?

Tagged as bug because I think we can fix the output of terraform providers and make it more helpful in such cases.

kmoe, May 26 '22 14:05

Yes, indeed it was an original design goal for terraform providers to make a best effort to describe the situation even if it's currently not totally valid, because its primary purpose is to help with debugging problems like this.

It looks like when we later retrofitted the dependency lock file it introduced some new error cases we didn't know to expect when originally implementing that command, and so it currently fails on those. I think we should be able to make it more resilient so that e.g. if it detects any problems related to the provider cache consistency with the lock file it will still proceed as if there had been no lock file at all, and then emit the error diagnostics only after it's displayed the subset of information it was able to determine.
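The behavior described above (show whatever could be determined, then report the lock file problem afterwards) can be illustrated with a small sketch. This is not Terraform's implementation, which is written in Go; every name here (LockFileError, Report, best_effort_providers, check_lock) is hypothetical and exists only to show the ordering of partial output and deferred diagnostics.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class LockFileError(Exception):
    """Hypothetical error for a lock file that disagrees with the provider cache."""


@dataclass
class Report:
    lines: List[str] = field(default_factory=list)
    diagnostics: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Partial information first, error diagnostics last.
        return "\n".join(self.lines + self.diagnostics)


def best_effort_providers(
    config_reqs: List[str],
    state_reqs: List[str],
    check_lock: Callable[[], None],
) -> Report:
    report = Report()
    try:
        check_lock()
    except LockFileError as err:
        # Record the problem and carry on as if there were no lock file.
        report.diagnostics.append(f"Error: {err}")
    report.lines.append("Providers required by configuration:")
    report.lines += [f"  provider[{p}]" for p in config_reqs]
    report.lines.append("Providers required by state:")
    report.lines += [f"  provider[{p}]" for p in state_reqs]
    return report
```

The point of the design is that a failed consistency check only adds a diagnostic; it never prevents the requirement listing from being produced.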

apparentlymart, May 26 '22 14:05

Wasted a good amount of time today because grepping the repo returned no results for whatever was pulling in the problem dependency. Turned out terraform providers has a section, "Providers required by state:", that gave me the root of the problem.

This has to do with hashicorp/template in my case.

ezpuzz, Sep 22 '22 19:09

I think the situation has changed a little since this issue was opened, but I don't think it's fully fixed.

What's changed is that the terraform init messaging should now recommend running terraform providers to see where all of the dependencies are coming from.

However, I don't think we've yet made any changes to prevent terraform providers from failing when the lock file is inconsistent with the cache. I suggest that we consider this issue to represent that bug, and therefore it's fixed once we're sure that terraform providers can always produce at least a partial result even if the provider cache directory is incomplete or the lock file is somehow invalid.


I think there is a broader question here about whether it would be viable to ignore data resources in the state when deciding what the state depends on. I don't think we'll be able to address that here because it's a more invasive thing to change, but it is interesting to note that installing providers for data resources that only exist in the state is only in service of some relatively unimportant situations:

  • terraform console allows referring to data resource instances read on the previous run
  • terraform show shows on-screen the result of data resource instances read on the previous run

In particular, terraform apply (and terraform plan) don't need to involve a provider to deal with a data resource that has been removed from the configuration already, because the only reasonable action in that case is to remove the stale object from the state entirely and Terraform Core can just do that itself without any need to inspect the provider-specific result data.

I wonder about somehow loosening the design requirements for terraform console and terraform show so that they can be permitted to just assume any data resource that isn't in the configuration doesn't exist at all, even if it does happen to still exist in the state. It seems relatively unlikely that someone would really need to look at the stale previous result from a data resource they've now removed, but that's just a hunch on my part.

I don't think we should block on resolving this question in order to fix the terraform providers bug, but it would be nice to avoid this problem in the first place by not even trying to install the hashicorp/template provider once all of the data blocks using it are removed from the configuration.

apparentlymart, Sep 22 '22 22:09