Regression for certain third-party S3 backends over S3 v2 API as of 1.11.2 (ceph, hetzner)

Open freznicek opened this issue 7 months ago • 10 comments

Terraform Version

1.11.2 and later

Terraform Configuration Files

Showing here just the Terraform S3 backend configuration.

terraform {
  backend "s3" {
    endpoints                   = { s3 = "<url>" }
    shared_credentials_files    = ["./.tf-s3-creds"]
    bucket                      = "$CONTAINER_NAME"
    use_path_style              = true
    key                         = "terraform.tfstate"
    workspace_key_prefix        = "<ostack-container-name>"
    region                      = "<ostack-region>"
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    skip_s3_checksum            = true
  }
}

Debug Output

│ Error: failed to upload state: operation error S3: PutObject, https response error StatusCode: 400, RequestID: tx00000a93590afd2885bfb-0068232994-e806fab1-cloud-ceph-objectstore-prod-brno, HostID: e806fab1-cloud-ceph-objectstore-prod-brno-cloud-ceph-objectstore-prod-brno, api error XAmzContentSHA256Mismatch: UnknownError

Expected Behavior

I understand that the S3 v2 API is not the most recent one and perhaps should not be enabled by default.

On the other hand, I believe this behavior is a regression, and the Terraform community needs to decide whether to support the S3 v2 API. I'd propose adding a parameter to the S3 backend that enforces the S3 v2 API.

Actual Behavior

Unable to push terraform state to S3 backend over S3 v2 API. Fails with:

│ Error: failed to upload state: operation error S3: PutObject, https response error StatusCode: 400, RequestID: tx00000a93590afd2885bfb-0068232994-e806fab1-cloud-ceph-objectstore-prod-brno, HostID: e806fab1-cloud-ceph-objectstore-prod-brno-cloud-ceph-objectstore-prod-brno, api error XAmzContentSHA256Mismatch: UnknownError

Steps to Reproduce

  1. Have an S3 backend available that is based on the Ceph RADOS Gateway
  2. Configure the Terraform infrastructure to use the s3 backend
  3. The classic Terraform workflow starts to fail (since Terraform 1.11.2):
  4. terraform init
  5. terraform validate
  6. terraform plan -out plan
  7. terraform apply plan

Additional Context

On-prem clouds with Ceph distributed storage backends still use the Ceph RADOS Gateway for their Swift / S3 object-store offerings.

  • https://docs.ceph.com/en/latest/radosgw/
  • https://docs.ceph.com/en/quincy/radosgw/s3/

References

This issue is related to the change in https://github.com/hashicorp/terraform/pull/36625.

Generative AI / LLM assisted development?

No response

freznicek avatar May 20 '25 09:05 freznicek

Thanks for this report! The AWS provider team at HashiCorp, codeowner for this functionality, has been notified and will triage on their timeline. Thanks!

crw avatar May 21 '25 22:05 crw

I'm having the same issue when migrating from a local tfstate to Hetzner Object Storage. The use case here is to also manage the created object with IaC.

So the code looks as follows:

backend "s3" {
  bucket = "states-object"
  endpoints = {
    s3 = "https://nbg1.your-objectstorage.com"
  }
  key                         = "terraform.tfstate"
  region                      = "main"
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_region_validation      = true
  skip_requesting_account_id  = true
  use_path_style              = true
  skip_s3_checksum            = true
}

Note that in my use case, I'm providing the Hetzner S3 credentials via the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

adriantunez avatar May 29 '25 06:05 adriantunez

Just downgrade to v1.11.1 for now to work around this problem. Regression testing ftw ;-)

drew-viles avatar May 29 '25 12:05 drew-viles

To be clear, it looks like issues are appearing in third-party implementations of S3 (ceph, hetzner), which historical experience tells us have not always kept up with the AWS SDKs. In the past, the AWS provider team at HashiCorp has been able to debug and fix some of these issues, but does not guarantee compatibility with third-party (not-AWS) vendors. You can report the incompatibility with the vendors as well. Thanks!

crw avatar May 29 '25 19:05 crw

Thanks for the clarification. I was only saying "regression testing ftw" in jest anyway, hence my winky face 😆 . I get there are a lot of moving parts with something like this!

drew-viles avatar May 29 '25 19:05 drew-viles

Understood! Just wanted to make clear that this seems to be scoped to non-AWS S3 backends. In fact, I will update the issue title. Thanks!

crw avatar May 29 '25 22:05 crw

Hi team, I encountered the same issue while upgrading Terraform from v1.10.5 to v1.11.4, but my specific third-party S3 backend is IBM Cloud Object Storage. If this issue comes from this implementation: https://github.com/hashicorp/terraform/issues/36113, is there an ETA for the remediation? For now, 1.10.x works, but we would benefit from using an actively patched version. Thanks!

FernandaDguez avatar Jul 14 '25 22:07 FernandaDguez

As suggested in https://github.com/hashicorp/terraform/issues/37130#issuecomment-2919186830, downgrading to v1.11.1 works for me.

zdk avatar Sep 02 '25 09:09 zdk

I can confirm that v1.11.1 also works with a ceph backend. But it would be preferable if we can upgrade to a version that is not end of life.

swerveshot avatar Sep 11 '25 09:09 swerveshot

Also confirming 1.11.1 works

bdeetz avatar Nov 04 '25 01:11 bdeetz
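
The downgrade workaround confirmed several times in this thread could be guarded with a version constraint, so that nobody on the team accidentally runs a newer Terraform binary against the same state. The fragment below is only a sketch: it assumes 1.11.1 is the release you want to stay on (the version reported as working above) and that the setting is added to the existing terraform block next to the backend configuration. It does not fix the underlying incompatibility.

```hcl
terraform {
  # Assumption: 1.11.1 is the last release reported working in this thread
  # for ceph and Hetzner backends. With an exact constraint, Terraform
  # refuses to run under any other version, including 1.11.2 and later.
  required_version = "= 1.11.1"
}
```

With this in place, running terraform init on 1.11.2 or later should stop with an unsupported-version error before any state is read or written.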