
Deprecate s3 remote backend lock table with new strong consistency

Open tomelliff opened this issue 4 years ago • 14 comments

Current Terraform Version

0.13.5

Proposal

Terraform currently requires the usage of a DynamoDB table for state file locking due to consistency requirements that have previously been impossible in S3 directly as it has eventual consistency on read after write. The GCS backend, however, is able to avoid this extra mechanism and just write the lock file to the GCS bucket as GCS provides strong read after write consistency.

AWS have announced that S3 will now support strong read after write consistency: https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/

I think this means that the DynamoDB lock table is now unnecessary and the same approach as the GCS backend can be used here to write the lock file directly to S3.

This should simplify the usage of Terraform's S3 backend as it removes an extra component.

I'd propose aligning the S3 backend with the GCS backend so it uses the lock file in S3 and marks the dynamodb_table argument as deprecated in 0.14 releases. I'd then propose removing the dynamodb_table argument altogether in 0.15/1.0.

References

  • https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/
  • https://www.terraform.io/docs/backends/types/s3.html
  • https://www.terraform.io/docs/backends/types/gcs.html

tomelliff avatar Dec 02 '20 09:12 tomelliff

If there's interest in this I might have some time to contribute the change on Friday.

tomelliff avatar Dec 02 '20 17:12 tomelliff

@tomelliff Thanks for the proposal!

The AWS team here at HashiCorp maintains our S3 remote state backend. They're probably a little busy with re:Invent going on at the moment, so I'm not sure they'll have a chance to review and weigh in on this by Friday, but I'll bring it to their attention.

pkolyvas avatar Dec 03 '20 00:12 pkolyvas

I'm curious if this could make it into 0.15.0 :-)

tbugfinder avatar Jan 13 '21 16:01 tbugfinder

Related:

We’ve not yet done any research to see if S3’s new guarantees are sufficient for that model to be safe, and I also want to be transparent that this isn’t an area we’re actively working on: the existing S3/DynamoDB solution is working and meeting the use-case, so our efforts are naturally focused on other situations that are not yet solved.

https://discuss.hashicorp.com/t/feature-request-terraform-state-locking-in-aws-with-s3-strong-consistency-no-dynamodb/18456/8

tdmalone avatar Feb 10 '21 23:02 tdmalone

@pkolyvas Any timeline updates here?

kennedyjustin avatar Jul 14 '21 13:07 kennedyjustin

Correct me if I'm wrong, but shouldn't this S3-strong-consistency tfstate locking improvement be done in Terraform core itself as opposed to the AWS provider codebase? I can see this snippet in the GCS state locking code:

const (
	stateFileSuffix = ".tfstate"
	lockFileSuffix  = ".tflock"
)

And in the S3 state locking code, this is where the DynamoDB table is passed to the remote client:

// get a remote client configured for this state
func (b *Backend) remoteClient(name string) (*RemoteClient, error) {
	if name == "" {
		return nil, errors.New("missing state name")
	}

	client := &RemoteClient{
		s3Client:              b.s3Client,
		dynClient:             b.dynClient,
		bucketName:            b.bucketName,
		path:                  b.path(name),
		serverSideEncryption:  b.serverSideEncryption,
		customerEncryptionKey: b.customerEncryptionKey,
		acl:                   b.acl,
		kmsKeyID:              b.kmsKeyID,
		ddbTable:              b.ddbTable,
	}

	return client, nil
}

Reference:

Object Lock: Store objects using a write-once-read-many (WORM) model to help you prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.

dcloud9 avatar Sep 29 '21 09:09 dcloud9

@dcloud9 this is proposed for Terraform core because that's where all the remote backends live right now. What made you think it was a suggestion for the AWS provider?

tomelliff avatar Sep 29 '21 10:09 tomelliff

@tomelliff Nothing, just asking for confirmation :point_up: . Thanks for confirming that there's an AWS team within Terraform core or maybe it's a shared resource for AWS provider as well.

dcloud9 avatar Sep 29 '21 10:09 dcloud9

Assuming this issue is acknowledged, what does this mean for people using s3 compatible storage like minio?

surskitt avatar Sep 29 '21 21:09 surskitt

Assuming this issue is acknowledged, what does this mean for people using s3 compatible storage like minio?

How does that currently get used with the lock table? Still using DynamoDB? Using DynamoDB Local?

Also, this looks like it would just work with MinIO anyway as it already has strict read after write consistency on non NFS filesystems:

Consistency Guarantees: MinIO follows strict read-after-write and list-after-write consistency model for all i/o operations both in distributed and standalone modes. This consistency model is only guaranteed if you use disk filesystems such as xfs, ext4 or zfs etc. for distributed setup.

If MinIO distributed setup is using NFS volumes underneath it is not guaranteed MinIO will provide these consistency guarantees since NFS is not consistent filesystem by design (If you must use NFS we recommend that you at least use NFSv4 instead of NFSv3).

tomelliff avatar Sep 30 '21 15:09 tomelliff

Is this still happening?

dude0001 avatar Oct 23 '21 18:10 dude0001

@tomelliff @pkolyvas any update on this? Thanks!

binlab avatar Apr 09 '22 23:04 binlab

Pulumi implements S3 lock files, so DynamoDB locks are superfluous.

dmccue avatar May 13 '22 20:05 dmccue

Hello,

Are you sure that strong consistency is enough to achieve locking? As far as I understand, strong consistency doesn't prevent two concurrent processes from writing a lock file simultaneously, in which case the latest write wins. So there's still a race condition where two concurrent processes both think they've locked the state, isn't there?

Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel

Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.

EDIT: I forgot to mention that alternatives exist: https://en.wikipedia.org/wiki/Mutual_exclusion#Software_solutions. I'm not an expert in concurrent algorithms, but I think one of them could work now that strong consistency is available.

yann-soubeyrand avatar Aug 06 '22 08:08 yann-soubeyrand

See also: https://github.com/hashicorp/terraform/pull/31454

crw avatar Aug 11 '22 21:08 crw

It's been a couple of years since this ticket was opened. Is this happening? I would love to remove the dependency on DynamoDB.

arun-a-nayagam avatar Oct 27 '22 09:10 arun-a-nayagam

Hello all. According to the AWS documentation on concurrency in the S3 data consistency model (https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel), we are not able to rely on the read-after-write consistency for locking.

The documentation notes that

Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.

We will reconsider this feature if the S3 consistency guarantees change in the future.

gdavison avatar Oct 28 '22 00:10 gdavison

This makes sense. Thank you for commenting.

dude0001 avatar Oct 28 '22 00:10 dude0001

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Nov 27 '22 02:11 github-actions[bot]