Deprecate s3 remote backend lock table with new strong consistency
Current Terraform Version
0.13.5
Proposal
Terraform currently requires a DynamoDB table for state file locking because S3 has historically offered only eventual read-after-write consistency, which made locking directly in S3 impossible. The GCS backend, however, avoids this extra mechanism and writes the lock file straight to the GCS bucket, since GCS provides strong read-after-write consistency.
AWS have announced that S3 now supports strong read-after-write consistency: https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/
I think this means that the DynamoDB lock table is now unnecessary and the S3 backend could take the same approach as the GCS backend, writing the lock file directly to S3.
This should simplify the usage of Terraform's S3 backend as it removes an extra component.
I'd propose aligning the S3 backend with the GCS backend so that it uses a lock file in S3 and marks the dynamodb_table argument as deprecated in the 0.14 releases. I'd then propose removing the dynamodb_table argument altogether in 0.15/1.0. A rough sketch of what the lock write could look like follows below.
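To make the proposal concrete, here's a rough sketch of the GCS-style lock write against S3, using the aws-sdk-go v1 API. The helper name, bucket name, and lock-info shape are illustrative only, not the backend's actual code, and this naive version deliberately glosses over the concurrent-writer question discussed further down the thread:

package main

import (
    "bytes"
    "fmt"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

const lockFileSuffix = ".tflock"

// writeLockFile stores a lock marker next to the state object, mirroring
// how the GCS backend keeps its lock as <state>.tflock in the same bucket.
func writeLockFile(svc *s3.S3, bucket, statePath string, lockInfo []byte) error {
    _, err := svc.PutObject(&s3.PutObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(statePath + lockFileSuffix),
        Body:   bytes.NewReader(lockInfo),
    })
    return err
}

func main() {
    svc := s3.New(session.Must(session.NewSession()))

    // With strong read-after-write consistency, any subsequent read of this
    // key is guaranteed to observe the write.
    err := writeLockFile(svc, "my-state-bucket", "env/prod/terraform.tfstate", []byte(`{"Who":"demo"}`))
    if err != nil {
        fmt.Println("lock write failed:", err)
    }
}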
References
- https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/
- https://www.terraform.io/docs/backends/types/s3.html
- https://www.terraform.io/docs/backends/types/gcs.html
If there's interest in this I might have some time to contribute the change on Friday.
@tomelliff Thanks for the proposal!
The AWS team here at HashiCorp maintains our S3 remote state backend. They're probably a little busy with re:Invent going on at the moment, so I'm not sure they'll have a chance to review and weigh in on this by Friday, but I'll bring it to their attention.
I'm curious if this could make it into 0.15.0 :-)
Related:
We’ve not yet done any research to see if S3’s new guarantees are sufficient for that model to be safe, and I also want to be transparent that this isn’t an area we’re actively working on: the existing S3/DynamoDB solution is working and meeting the use-case, so our efforts are naturally focused on other situations that are not yet solved.
https://discuss.hashicorp.com/t/feature-request-terraform-state-locking-in-aws-with-s3-strong-consistency-no-dynamodb/18456/8
@pkolyvas Any timeline updates here?
Correct me if I'm wrong, but shouldn't this S3-strong-consistency tfstate locking improvement be done in Terraform core itself, as opposed to the AWS provider codebase? I can see this snippet in the GCS state locking code:
const (
    stateFileSuffix = ".tfstate"
    lockFileSuffix  = ".tflock"
)
And this is the spot in the S3 state locking code where the DynamoDB table gets wired in:
// get a remote client configured for this state
func (b *Backend) remoteClient(name string) (*RemoteClient, error) {
    if name == "" {
        return nil, errors.New("missing state name")
    }

    client := &RemoteClient{
        s3Client:              b.s3Client,
        dynClient:             b.dynClient,
        bucketName:            b.bucketName,
        path:                  b.path(name),
        serverSideEncryption:  b.serverSideEncryption,
        customerEncryptionKey: b.customerEncryptionKey,
        acl:                   b.acl,
        kmsKeyID:              b.kmsKeyID,
        ddbTable:              b.ddbTable,
    }

    return client, nil
}
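For context on what that ddbTable field feeds into: the lock itself is taken with a conditional write to DynamoDB that only succeeds if no item with the same LockID already exists, which is what gives the current backend its mutual exclusion. A simplified sketch of that pattern using the aws-sdk-go v1 API (the function and table names here are illustrative, not the backend's actual code):

package main

import (
    "fmt"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/dynamodb"
)

// acquireLock attempts to take the state lock by writing an item to the
// lock table. The ConditionExpression makes the write atomic: DynamoDB
// rejects it if another process already holds an item with this LockID.
func acquireLock(svc *dynamodb.DynamoDB, table, lockID, info string) error {
    _, err := svc.PutItem(&dynamodb.PutItemInput{
        TableName: aws.String(table),
        Item: map[string]*dynamodb.AttributeValue{
            "LockID": {S: aws.String(lockID)},
            "Info":   {S: aws.String(info)},
        },
        // Fail the write if the lock item already exists.
        ConditionExpression: aws.String("attribute_not_exists(LockID)"),
    })
    if err != nil {
        return fmt.Errorf("state already locked (or write failed): %w", err)
    }
    return nil
}

func main() {
    svc := dynamodb.New(session.Must(session.NewSession()))
    if err := acquireLock(svc, "terraform-locks", "my-bucket/env/prod/terraform.tfstate", "locked by demo"); err != nil {
        fmt.Println(err)
    }
}

It's this atomic compare-and-set, which a plain S3 PUT doesn't offer, that any S3-only locking scheme would need to replicate.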
Reference:
Object Lock: Store objects using a write-once-read-many (WORM) model to help you prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
@dcloud9 this is proposed for Terraform core because that's where all the remote backends live right now. What made you think it was a suggestion for the AWS provider?
@tomelliff Nothing, just asking for confirmation :point_up:. Thanks for confirming that there's an AWS team within Terraform core, or perhaps a resource shared with the AWS provider as well.
Assuming this issue is acknowledged, what does this mean for people using s3 compatible storage like minio?
How does that currently get used with the lock table? Still using DynamoDB? Using DynamoDB Local?
Also, this looks like it would just work with MinIO anyway, as MinIO already has strict read-after-write consistency on non-NFS filesystems:
Consistency Guarantees: MinIO follows a strict read-after-write and list-after-write consistency model for all I/O operations, in both distributed and standalone modes. This consistency model is only guaranteed if you use disk filesystems such as xfs, ext4, or zfs for the distributed setup.
If a MinIO distributed setup is using NFS volumes underneath, MinIO is not guaranteed to provide these consistency guarantees, since NFS is not a consistent filesystem by design. (If you must use NFS, we recommend at least NFSv4 instead of NFSv3.)
Is this still happening?
@tomelliff @pkolyvas any update on this? Thanks!
Pulumi implements S3 lock files; DynamoDB locks are superfluous.
Hello,
Are you sure that strong consistency is enough to implement locking? As far as I understand, strong consistency doesn't prevent two concurrent processes from writing a lock file simultaneously; the latest write wins. So there's still a race condition where two concurrent processes both think they've locked the state, isn't there?
Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel
Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.
EDIT: I forgot to mention that alternatives exist: https://en.wikipedia.org/wiki/Mutual_exclusion#Software_solutions. I'm not an expert in concurrent algorithms, but I think one of these could work now that strong consistency is available.
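To spell out the race in code: a check-then-write sequence against S3 is not atomic, so two processes can interleave between the check and the write. A contrived sketch using the aws-sdk-go v1 API (function and key names are mine, not Terraform's):

package main

import (
    "bytes"
    "fmt"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

// acquireLockUnsafe is exactly the race described above: strong consistency
// guarantees a reader sees the latest completed write, but it does not
// serialize two concurrent writers.
func acquireLockUnsafe(svc *s3.S3, bucket, lockKey string) error {
    // Step 1: check whether a lock file already exists.
    if _, err := svc.HeadObject(&s3.HeadObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(lockKey),
    }); err == nil {
        return fmt.Errorf("state already locked")
    }

    // !!! Window: a second process can pass the same check here and also
    // reach the PutObject below; the later write silently wins, and both
    // processes believe they hold the lock.

    // Step 2: write our lock file.
    _, err := svc.PutObject(&s3.PutObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(lockKey),
        Body:   bytes.NewReader([]byte("locked")),
    })
    return err
}

func main() {
    svc := s3.New(session.Must(session.NewSession()))
    if err := acquireLockUnsafe(svc, "my-state-bucket", "env/prod/terraform.tfstate.tflock"); err != nil {
        fmt.Println(err)
    }
}

As far as I can tell, the GCS backend avoids this because it creates the lock object with a does-not-exist precondition, so only one concurrent create can succeed; plain S3 PUTs have no equivalent conditional create, which is exactly the gap the DynamoDB table fills.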
See also: https://github.com/hashicorp/terraform/pull/31454
It's been a couple of years since this ticket was opened. Is this happening? Would love to remove the dependency on DynamoDB.
Hello all. According to the AWS documentation on concurrency in the S3 data consistency model (https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html#ConsistencyModel), we are not able to rely on read-after-write consistency for locking.
The documentation notes that
Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.
We will reconsider this feature if the S3 consistency guarantees change in the future.
This makes sense. Thank you for commenting.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.