
Creating a new workspace with `terraform workspace new -state=tf.state.default` does not work for s3 remote state.

Open dnozay opened this issue 4 years ago • 2 comments

Terraform Version

$ terraform version
Terraform v1.0.9
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v3.37.0
+ provider registry.terraform.io/hashicorp/random v3.1.0

Terraform Configuration Files

terraform {
  backend "s3" {
    bucket               = "my-tfstate-bucket"
    key                  = "kubernetes-infra.tfstate"
    region               = "us-west-2"
    workspace_key_prefix = "tf-state"
    profile              = "test-account/administrator"
  }
}

Debug Output

2021-10-27T16:38:34.408-0700 [DEBUG] [aws-sdk-go] DEBUG: Request s3/PutObject Details:
---[ REQUEST POST-SIGN ]-----------------------------
PUT /tf-state/dnozay_testing/kubernetes-infra.tfstate HTTP/1.1
Host: xxxxxx
User-Agent: aws-sdk-go/1.40.25 (go1.16.4; darwin; amd64) APN/1.0 HashiCorp/1.0 Terraform/1.0.9
Content-Length: 155
Authorization: xxxxxx
Content-Md5: xxxxxx
Content-Type: application/json
X-Amz-Content-Sha256: xxxxxx
X-Amz-Date: xxxxxx
X-Amz-Security-Token: xxxxxx
Accept-Encoding: gzip

{
  "version": 4,
  "terraform_version": "1.0.9",
  "serial": 0,
  "lineage": "xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxxxx",
  "outputs": {},
  "resources": []
}

Expected Behavior

Creating a new workspace should succeed and use the state file given with -state as its initial state.

terraform state pull > tf.state.default
terraform workspace new -state=tf.state.default ${USER}_testing

  • This issue may be specific to the s3 backend / s3 remote state.
  • I do not see this issue when using gcs remote state.

Actual Behavior

  • The new workspace is initialized with empty state.

Steps to Reproduce

terraform workspace select default
terraform state pull > tf.state.default
ls -lh tf.state.default
echo "number of resources=$(terraform state list | wc -l)"
terraform workspace delete -force ${USER}_testing
TF_LOG=trace terraform workspace new -state=tf.state.default ${USER}_testing
echo "number of resources=$(terraform state list | wc -l) 😭😭😭"
terraform state push tf.state.default
echo "number of resources=$(terraform state list | wc -l)"

Additional Context

Output of the steps above without TF_LOG=trace:

Switched to workspace "default".
-rw-r--r--  1 dnozay  staff   281K Oct 27 16:45 tf.state.default
number of resources=     205
Deleted workspace "dnozay_testing"!
WARNING: "dnozay_testing" was non-empty.
The resources managed by the deleted workspace may still exist,
but are no longer manageable by Terraform since the state has
been deleted.

Created and switched to workspace "dnozay_testing"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
number of resources=       0 😭😭😭
number of resources=     205

As the steps show, the workaround is to push the state explicitly:

terraform state push tf.state.default

However, the pushed state keeps the same lineage as the default workspace, which could be a problem. In this respect, the S3 backend and the GCS backend do not behave the same way.
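
To check the lineage concern, the top-level `lineage` and `serial` fields of two state snapshots can be compared. The following is a minimal sketch, assuming both files are pretty-printed state JSON as written by `terraform state pull` (each top-level key on its own line with a two-space indent, as in the debug output above). The function names `state_field` and `same_lineage` are hypothetical helpers, not part of Terraform.

```shell
#!/bin/sh
# Hypothetical helpers. They assume the pretty-printed layout produced by
# `terraform state pull`: each top-level key on its own line, indented by
# two spaces.

# state_field FILE KEY: print the value of a top-level scalar field
# (for example "lineage" or "serial"), without surrounding quotes.
state_field() {
  sed -n "s/^  \"$2\": \"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\}\$/\1/p" "$1" | head -n 1
}

# same_lineage FILE_A FILE_B: succeed when both snapshots share a lineage.
same_lineage() {
  [ "$(state_field "$1" lineage)" = "$(state_field "$2" lineage)" ]
}
```

For example, after running `terraform state pull > tf.state.testing` in the new workspace (the file name is only an illustration), `same_lineage tf.state.default tf.state.testing && echo "lineage is shared"` would report whether the workaround left both workspaces on one lineage.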

dnozay, Oct 27 '21 23:10

Thanks for reporting this, @dnozay.

For most commands -state=... is a legacy option for the local backend only, but it seems like it intentionally has a different meaning for terraform workspace new, because that command is handling the option inline itself rather than passing it over to the backend as other commands do:

https://github.com/hashicorp/terraform/blob/de105595e2788b5614081a295268cdb75964ee06/internal/command/workspace_new.go#L140-L163

(statePath in the above is what the -state=... option gets decoded into.)

The logic here seems to be backend-agnostic:

  1. Read the given file into memory as a state file object.
  2. Write the state file object directly to the "state manager", which is an interface that all of the backends implement. This'll typically just update the in-memory structure to consider this new snapshot to be the current snapshot.
  3. Persist the current snapshot in the state manager, which typically means to serialize the current snapshot back to the state file serialization and write it to whatever remote storage we're talking about. (e.g. S3 or GCS)

With that said, it's not clear to me why this behavior would differ depending on which backend you've selected, and I wonder whether something else was confounding things here and made it seem like the GCS backend behaved differently. I'm going to reclassify this as a general CLI bug for the moment, since I think we ought to prove that it is an S3-backend-specific issue before we pass it over to the AWS provider team (who maintain that backend).
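
One way to look for a confounding factor is to inspect the object that step 3 actually writes to the bucket. The sketch below assumes the S3 backend layout visible in the debug output, where a non-default workspace is stored under `<workspace_key_prefix>/<workspace>/<key>` and the default workspace at `<key>`. It also assumes pretty-printed state JSON. The helper names `s3_state_object_key` and `state_has_no_resources` are hypothetical.

```shell
#!/bin/sh
# s3_state_object_key PREFIX WORKSPACE KEY: print the object key the S3
# backend is assumed to use for the given workspace.
s3_state_object_key() {
  if [ "$2" = "default" ]; then
    printf '%s\n' "$3"
  else
    printf '%s/%s/%s\n' "$1" "$2" "$3"
  fi
}

# state_has_no_resources FILE: succeed when a pretty-printed state snapshot
# has an empty top-level "resources" array, as in the PutObject body shown
# in the debug output.
state_has_no_resources() {
  grep -q '^  "resources": \[\]' "$1"
}
```

With the configuration from this issue, `s3_state_object_key tf-state dnozay_testing kubernetes-infra.tfstate` prints `tf-state/dnozay_testing/kubernetes-infra.tfstate`, which matches the PUT path in the debug output. Downloading that object (for example with `aws s3 cp`) and passing the file to `state_has_no_resources` would show whether the empty snapshot is what was written, rather than an artifact of a later read.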

apparentlymart, Oct 29 '21 22:10

As mentioned in the repro scenario:

terraform workspace new -state=tf.state.default ${USER}_testing
terraform state list  | wc -l

shows no resources when using s3 remote state; I've also tried with gcs, and that worked much better.

dnozay, Nov 02 '21 05:11

This is still an issue on:

Terraform v1.3.1
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.32.0

tsibley, Sep 30 '22 21:09

I have the same problem with:

Terraform v1.4.0
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.57.1

When the state is stored in an Azure storage account, it works.

mattew, Mar 09 '23 06:03