v1.8 says Backend configuration block has changed when it hasn't

jorhett opened this issue 10 months ago • 12 comments

Terraform Version

Terraform v1.8.0
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v5.43.0

Terraform Configuration Files

terraform {
  backend "s3" {
    bucket         = "terraform-state"
    region         = "us-west-1"
    key            = "test/terraform.tfstate"
  }
}

Debug Output

N/A

Expected Behavior

$ terraform plan test

Acquiring state lock. This may take a few moments...

Actual Behavior

$ terraform plan test

 Error: Backend initialization required: please run "terraform init"
│
│ Reason: Backend configuration block has changed
│
The "backend" is the interface that Terraform uses to store state,
...

Steps to Reproduce

  1. Make a small component with an s3 backend, apply with Terraform 1.7
  2. Switch to Terraform 1.8 and try to generate a plan
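
For example, with a version manager such as tfenv (tfenv is just one illustrative way of switching binaries; any method reproduces it):

$ tfenv use 1.7.5
$ terraform init && terraform apply
$ tfenv use 1.8.0
$ terraform plan   # fails with: Backend configuration block has changed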

Additional Context

This change is not mentioned in the upgrade guide or the changelog.

References

No response

jorhett • Apr 10 '24 23:04

Hi @jorhett! Sorry for this misbehavior, and thanks for reporting it.

From your reproduction steps, it sounds like you ran terraform init with Terraform v1.7, and then later ran terraform apply in the same directory with Terraform v1.8, without reinitializing the directory using Terraform v1.8. Is that true?

If so, it would probably help to run terraform init with Terraform v1.8 so it'll have an opportunity to rebuild the cached backend configuration for the working directory to incorporate the changes related to removing "the legacy workflow".

Alternatively, you could start with a fresh working directory (without any existing .terraform subdirectory) and that should produce essentially the same effect: the cached backend configuration will have been created based on the backend's schema from v1.8, and so should match with how the backend is now interpreting what's in your configuration.
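
Concretely, the fresh-directory route would be just the following, assuming nothing else you care about (such as a local plugin cache) lives under .terraform:

$ rm -rf .terraform
$ terraform init    # run with the v1.8 binary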

If that does fix it for you, then we can change the upgrade guide to mention this additional hazard when reusing a pre-existing working directory. Otherwise, I'll leave this to the S3 provider team (who maintains this backend) for further investigation.

Thanks!

apparentlymart • Apr 11 '24 00:04

Hi @apparentlymart , I had the same problem and both of your proposed solutions are working. Thanks!

When keeping the .terraform/ directory, I had to run terraform init -reconfigure.

After deleting the .terraform/ directory, terraform init can be used without the -reconfigure flag.
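
So, summarized, either of these sequences works:

# keeping .terraform/
$ terraform init -reconfigure

# starting fresh
$ rm -rf .terraform
$ terraform init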

marcelfrey29 • Apr 11 '24 05:04

@apparentlymart No, in this case we ran init/plan/apply with v1.7 and then came back and tried to generate a plan with v1.8, and were forced to run terraform init -reconfigure.

This is breaking the automation toolset for EVERY component in our organization, requiring manual intervention for each and every one... when there was zero change to the code.

Why can't the legacy workflow be removed without forcing every single person to hop out of the car, pop the hood, and apply a change that they didn't make?

At the very least, this should be mentioned in the changelog and upgrade notes; it isn't mentioned in either. But better yet, how about not telling people they changed something which they did not change? And can you at least own up to the problem?

"Hey, we changed the backend config in a backwards-incompatible way. Sorry, it's not you, it was us. Please run terraform init -reconfigure to make everything happy again."

jorhett • Apr 11 '24 06:04

To be on point here, we cannot just change our release process to automatically run terraform init -reconfigure every time, because if a change includes a state location change that needs -migrate-state but -reconfigure was run instead, then POOF, all state is gone.

So if you really are saying that we have to re-init every root module any time a new Terraform version ships, then please give us some command by which we can confirm the backend DID NOT CHANGE and that a blind -reconfigure would be safe, or some other programmatic way that doesn't involve sending people to run raw commands in Terraform modules.
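
For illustration, something like this guard is what we'd otherwise have to bolt on ourselves (a sketch only; the git heuristic is ours and crude, so adapt it to your own CI):

#!/usr/bin/env bash
set -euo pipefail

# Retry with -reconfigure only when (a) the failure is this exact
# cached-schema error and (b) no *.tf file changed in the triggering
# commit, a crude stand-in for "the backend block did not change".
if ! out=$(terraform plan -input=false 2>&1); then
  if grep -q 'Backend configuration block has changed' <<<"$out" \
      && git diff --quiet HEAD~1 -- '*.tf'; then
    terraform init -input=false -reconfigure
    terraform plan -input=false
  else
    printf '%s\n' "$out" >&2
    exit 1
  fi
fi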

jorhett • Apr 11 '24 07:04

Also, the page you linked to speaks only vaguely about the idea, with zero mention of the impact:

  Terraform v1.8 completes this deprecation process by removing the use_legacy_workflow argument. The old behavior is no longer available, and so you will need to adopt the new behavior when upgrading to Terraform v1.8.

We have never in the history of our repo (10+ years of Terraform!) used this argument. Therefore it's hard to understand how I'm supposed to know that this sentence means I must manually reinitialize every module before I can plan again.

$ grep -h use_legacy_workflow */.terraform/* 2> /dev/null |sort | uniq
            "use_legacy_workflow": null,

jorhett • Apr 11 '24 07:04

I'm running into the very same issue as @jorhett described in his previous comments, and wanted to echo all of his concerns and surprise that this wasn't mentioned clearly in the release notes for Terraform 1.8:

  • Our organization has never used the use_legacy_workflow flag
  • All Terraform plans fail until a manual terraform init -reconfigure command has been executed (we're talking hundreds of different broken plans)

This currently breaks all automations we have in place and requires a manual intervention for all Terraform configs.

danielhanold • Apr 16 '24 15:04

I'm seeing the same problem and I've also never used use_legacy_workflow.

Should we expect to see a patch that will allow us to avoid having to reinitialise state?

surskitt • Apr 18 '24 12:04

Having the same behavior when switching from Terraform 1.7.5 to 1.8.2.

╷
│ Error: Backend initialization required: please run "terraform init"
│ 
│ Reason: Backend configuration block has changed
│ 
│ The "backend" is the interface that Terraform uses to store state,
│ perform operations, etc. If this message is showing up, it means that the
│ Terraform configuration you're using is using a custom configuration for
│ the Terraform backend.
│ 
│ Changes to backend configurations require reinitialization. This allows
│ Terraform to set up the new configuration, copy existing state, etc. Please
│ run "terraform init" with either the "-reconfigure" or "-migrate-state"
│ flags to use the current configuration.
│ 
│ If the change reason above is incorrect, please verify your configuration
│ hasn't changed and try again. At this point, no changes to your existing
│ configuration or state have been made.

Running terraform init -reconfigure or terraform init -migrate-state doesn't change anything, while switching back to Terraform 1.7.5 works as expected.

We tried removing the .terraform directory and the lock file too. The same error happens.

Mmasson-01 • May 07 '24 20:05

Another way to simply keep working is to remove the offending field from the cached backend state file under .terraform/. No need for terraform init at all.

sed -i 's/"use_legacy_workflow": null,//' .terraform/terraform.tfstate 

YMMV. Use at your own risk!
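
If you do try it, take a backup first so you can roll back:

cp .terraform/terraform.tfstate .terraform/terraform.tfstate.bak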

rwunderer • Jun 05 '24 15:06

The same error is shown for the Azure storage account backend with Terraform 1.8.5.

Any hints or a fix, @apparentlymart?

vanmash • Jun 19 '24 08:06

Adding the parameter below seemed to solve the problem for me:

backendAzureRmUseEnvironmentVariablesForAuthentication: true

- task: TerraformTaskV4@4
  displayName: 'init'
  inputs:
    provider: 'azurerm'
    command: 'init'
    workingDirectory: '$(System.DefaultWorkingDirectory)/terraform/'
    backendAzureRmUseEnvironmentVariablesForAuthentication: true
    backendServiceArm: '$(servicePrincipal)'
    backendAzureRmResourceGroupName: '$(backendRG)'
    backendAzureRmStorageAccountName: '$(backendSA)'
    backendAzureRmContainerName: '$(backendContainer)'
    backendAzureRmKey: '$(backendKey)'

vanmash • Jun 19 '24 21:06

I think the extra terraform init -reconfigure is required because the removal of the deprecated attribute caused the schema to change, so it affects all S3 backend users rather than just the ones who used the attribute in question. We'll update the upgrade guide to mention this.
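
The stale attribute sits in the cached file even for users who never set it, which is why everyone was affected; the grep posted earlier in this thread shows it:

$ grep use_legacy_workflow .terraform/terraform.tfstate
            "use_legacy_workflow": null,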

nfagerlund • Jun 24 '24 20:06

@nfagerlund @apparentlymart I really think that changes which break thousands of Terraform components should not be shipped in a non-breaking version. How hard would it have been to simply ignore this field and remove it from the state file? It caused a visible outage for every company that runs Terraform inside another tool (a wrapper that provides input data, etc.).

It is NOT safe to blindly run terraform init -reconfigure on every invocation. That is the kind of thing that needs human 👀 to ensure the changes are correct. But when you require that attention just because you want to remove a field from the file, and ask people to run this command to clean up for you, you're forcing them to pay attention to something that needs no attention.

Keep doing this, and people will pay less attention... and cause major breakage. Or, frankly, they will migrate to the alternative, which doesn't have this kind of breakage in a non-breaking version. This breakage caused the first serious discussion about whether we should use the alternative.

jorhett • Jul 17 '24 17:07

Thanks for that feedback!

As the upgrade guide now has the proper guidance, I am going to close this issue. Thanks again.

crw • Jul 17 '24 18:07

This hasn't resolved the problem for which the issue was opened. Documenting the change wasn't the ask, @crw; fixing the code so it doesn't introduce a breaking change was the request of this issue.

jorhett • Jul 18 '24 03:07

As I understand it after re-reading this issue, this is a one-time problem caused by migrating the S3 backend to the new AWS SDKv2, an upgrade that necessitated the schema change. The purpose of the upgrade guide is to document the steps needed to migrate from an older version of Terraform to a newer one; per this ticket, the upgrade guide was missing information, which led to this problem.

I will bring this issue up in triage again to make sure we feel we have reached a solution on this issue. Thanks for your continued feedback!

crw • Jul 18 '24 14:07

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] • Aug 18 '24 02:08