OpenTofu Incorrectly Reports Backend Configuration has Changed When Using Early Evaluation
OpenTofu Version
❯ tofu version
OpenTofu v1.8.2
on darwin_arm64
OpenTofu Configuration Files
main.tofu
terraform {
  required_version = "~> 1.8.0"

  backend "gcs" {
    bucket = "${var.project}-tfstate"
  }
}

variable "project" {
  description = "The ID of the GCP Project we will use for state management"
  type        = string
}
dev.tfvars - Substitute <my_project_id> with your own value
project = "<my_project_id>"
Debug Output
https://gist.github.com/elliott-weston-ki/48933cc224e56e316e3db3459e15fdba
Expected Behavior
The backend should initialise successfully when the same init command is run a second time.
Actual Behavior
OpenTofu returns an error, and reports that the backend configuration has changed.
Steps to Reproduce
- tofu init -var-file=dev.tfvars -backend-config="prefix=elliott/debug-tofu"
- tofu init -var-file=dev.tfvars -backend-config="prefix=elliott/debug-tofu"
Shell Output from those steps:
❯ tofu init -var-file=dev.tfvars -backend-config="prefix=elliott/debug-tofu"
Initializing the backend...
Successfully configured the backend "gcs"! OpenTofu will automatically
use this backend unless the backend configuration changes.
Initializing provider plugins...
OpenTofu has been successfully initialized!
You may now begin working with OpenTofu. Try running "tofu plan" to see
any changes that are required for your infrastructure. All OpenTofu commands
should now work.
If you ever set or change modules or backend configuration for OpenTofu,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
❯ tofu init -var-file=dev.tfvars -backend-config="prefix=elliott/debug-tofu"
Initializing the backend...
╷
│ Error: Backend configuration changed
│
│ A change in the backend configuration has been detected, which may require migrating existing state.
│
│ If you wish to attempt automatic migration of the state, use "tofu init -migrate-state".
│ If you wish to store the current configuration with no changes to the state, use "tofu init -reconfigure".
Additional Context
- I have found that if I delete the local state file (in .terraform/terraform.tfstate) then I am able to re-initialise again.
- The contents of this state file are below. I have redacted part of the bucket name as it contains my GCP Project ID, but it was set correctly:
{
  "version": 3,
  "serial": 1,
  "lineage": "46c5c5b4-5b0f-2922-787d-b50fbaf9d72d",
  "backend": {
    "type": "gcs",
    "config": {
      "access_token": null,
      "bucket": "REDACTED-tfstate",
      "credentials": null,
      "encryption_key": null,
      "impersonate_service_account": null,
      "impersonate_service_account_delegates": null,
      "kms_encryption_key": null,
      "prefix": "elliott/debug-tofu",
      "storage_custom_endpoint": null
    },
    "hash": 2674230356
  },
  "modules": [
    {
      "path": [
        "root"
      ],
      "outputs": {},
      "resources": {},
      "depends_on": []
    }
  ]
}
- The state file was successfully created in the remote backend
Contents of this file:
{
  "version": 4,
  "terraform_version": "1.8.2",
  "serial": 1,
  "lineage": "8cdb9b66-56c3-8d84-baca-02095fb74882",
  "outputs": {},
  "resources": [],
  "check_results": null
}
Hello and thank you for this issue! The core team regularly reviews new issues and discusses them, but this can take a little time. Please bear with us while we get to your issue. If you're interested, the contribution guide has a section about the decision-making process.
In case it's useful to someone looking into this in future:
Note that, despite the confusing name, .terraform/terraform.tfstate hasn't really been a "state file" since v0.9. In that release, it switched to being just a place for tofu init to write the finalized backend configuration (including any -backend-config arguments that wouldn't otherwise be persisted anywhere) for other commands to use.
It retains the same filename as was historically used for state, and uses the same structure that was current for the state snapshot format in v0.9 (note format version 3 instead of 4), but those are both just pragmatic decisions to make it easier to migrate from v0.8 since the upgrade process would wholesale replace the previous "real" state with the new not-actually-a-state without any possibility of tricky intermediate steps where both files might be present.
That has unfortunately left us in a pretty confusing situation, so I hope this note is helpful to prevent a future reader from being confused by it. :grinning:
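To make the role of that file concrete, here is a minimal Go sketch that decodes only the fields visible in the example file earlier in this issue and prints the backend settings that tofu init recorded. The type backendFile and the function parseBackendFile are names made up for this illustration, not OpenTofu's internal types, and the embedded sample is a trimmed copy of the file shown above. In practice the bytes would come from reading .terraform/terraform.tfstate, for example with os.ReadFile.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backendFile models only the fields shown in the issue's example file.
// It is a hypothetical type for illustration, not OpenTofu's own structure.
type backendFile struct {
	Version int `json:"version"`
	Backend struct {
		Type   string                 `json:"type"`
		Config map[string]interface{} `json:"config"`
		Hash   uint64                 `json:"hash"`
	} `json:"backend"`
}

// parseBackendFile decodes the JSON content of .terraform/terraform.tfstate.
// Fields that the struct does not declare (serial, lineage, modules) are ignored.
func parseBackendFile(data []byte) (*backendFile, error) {
	var f backendFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, fmt.Errorf("decoding backend file: %w", err)
	}
	return &f, nil
}

// sample is a trimmed copy of the file from the original report.
const sample = `{
  "version": 3,
  "serial": 1,
  "lineage": "46c5c5b4-5b0f-2922-787d-b50fbaf9d72d",
  "backend": {
    "type": "gcs",
    "config": {
      "bucket": "REDACTED-tfstate",
      "credentials": null,
      "prefix": "elliott/debug-tofu"
    },
    "hash": 2674230356
  }
}`

func main() {
	f, err := parseBackendFile([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Printf("format version: %d\n", f.Version)
	fmt.Printf("backend type:   %s\n", f.Backend.Type)
	// Print only the settings that were actually set; unset ones are null.
	for name, value := range f.Backend.Config {
		if value != nil {
			fmt.Printf("  %s = %v\n", name, value)
		}
	}
}
```

Note that the prefix value, which was only ever given through -backend-config, shows up in the config object alongside the bucket from the backend block.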
I've reproduced this locally on my system using the local backend instead of the gcs backend, since that means we can test without needing any live credentials for a particular remote service.
My version of the configuration:
variable "state_file_basename" {
  type    = string
  default = "terraform"
}

terraform {
  backend "local" {
    path = "${var.state_file_basename}.tfstate"
  }
}
My reproduction steps:
- tofu init -var=state_file_basename=foo -backend-config=workspace_dir=whatever
- tofu init -var=state_file_basename=foo -backend-config=workspace_dir=whatever
Some other details I noticed while experimenting:
- The -backend-config argument seems to be the crucial piece for hitting this bug. Using just tofu init -var=state_file_basename=foo behaves idempotently as expected.
- However, having a static-eval variable in the backend configuration is also required. If I remove the var.state_file_basename interpolation and just hard-code foo.tfstate then it also behaves idempotently.
- The bug also occurs if I leave -var=state_file_basename=... unset on the command line and just let it use the default.
So altogether, it seems to be the combination of static evaluation and CLI-overridden backend configuration that causes the problem.
I think I've found the root cause of this problem.
During backend initialization, OpenTofu re-evaluates the configuration as part of Meta.backendConfigNeedsMigration, which is the function that is ultimately responsible for checking whether the configuration is still the same as it was at original init. It evaluates the configuration here:
https://github.com/opentofu/opentofu/blob/de69070b0279f23216f24b499a06e0a2dda44559/internal/command/meta_backend.go#L1365-L1371
Notice that this snippet ends with a trace log that's also in the log included in the original issue:
2024-09-27T08:45:12.017+0100 [TRACE] backendConfigNeedsMigration: failed to decode given config; migration codepath must handle problem: main.tofu:5,17-20: Variables not allowed; Variables may not be used here.
This error occurs because the third argument to hcldec.Decode is nil. That argument is the hcl.EvalContext, which decides which variables and functions are available during expression evaluation. Setting it to nil tells HCL that only literal values are allowed in this location, so it treats the var.state_file_basename in my example, and the var.project in the original example, as invalid.
It seems like this particular call was missed when retrofitting the backend codepaths to use the hcl.EvalContext from the static evaluation system. Therefore OpenTofu is treating this the same as any other error in the backend configuration; I get a similar error if I assign an empty tuple to the path argument, since that's a type mismatch error against the backend's config schema:
2024-10-08T16:30:05.966-0700 [TRACE] backendConfigNeedsMigration: failed to decode given config; migration codepath must handle problem: backend-config-changed.tf:8,11-13: Incorrect attribute value type; Inappropriate value for attribute "path": string required.
[...]
╷
│ Error: Backend configuration changed
│
│ A change in the backend configuration has been detected, which may
│ require migrating existing state.
│
│ If you wish to attempt automatic migration of the state, use "tofu
│ init -migrate-state".
│ If you wish to store the current configuration with no changes to the
│ state, use "tofu init -reconfigure".
╵
I expect that the fix here will be to modify this codepath in a similar way to how all of the other backend-config-eval codepaths were modified to implement this feature, so that the migration-check code will find exactly the same configuration content that the initial init would've reacted to.
(This only affects the case where -backend-config is present because, when there are no CLI-based overrides, the check uses a simpler path that compares the previously-stored configuration against the current configuration using a hash. We perform this extra re-evaluation in the -backend-config-present codepath because we're trying to check just the configuration written in the backend block, ignoring the -backend-config arguments, whereas the hash used for the initial check already has the CLI arguments incorporated into it and so can't be used directly in the override case.)