terraform-provider-aws

[Bug]: aws_elasticsearch_domain_policy Error: setting Elasticsearch Domain Policy (): ValidationException: A change/update is in progress. Please wait for it to complete before requesting another change

Open scott-doyland-burrows opened this issue 1 year ago • 3 comments

Terraform Core Version

1.7.3

AWS Provider Version

5.29.0

Affected Resource(s)

aws_elasticsearch_domain is created and then triggers aws_elasticsearch_domain_policy, but the second resource fails with:

│ Error: setting Elasticsearch Domain Policy (): ValidationException: A change/update is in progress. Please wait for it to complete before requesting another change.
│ 
│   with module.terraform-module-environment.aws_elasticsearch_domain_policy.elastic_search["inf01"],
│   on modules/main/elastic-search.tf line 132, in resource "aws_elasticsearch_domain_policy" "elastic_search":
│  132: resource "aws_elasticsearch_domain_policy" "elastic_search" {

Looks like the first resource is not fully up when the second resource runs.

This used to work, and only stopped working a few weeks ago.

Expected Behavior

apply should work.

The provider should wait for the first resource to be fully up before moving on to the second resource.

Actual Behavior

apply fails, and a second apply works.

Relevant Error/Panic Output Snippet

│ Error: setting Elasticsearch Domain Policy (): ValidationException: A change/update is in progress. Please wait for it to complete before requesting another change.
│ 
│   with module.terraform-module-environment.aws_elasticsearch_domain_policy.elastic_search["inf01"],
│   on modules/main/elastic-search.tf line 132, in resource "aws_elasticsearch_domain_policy" "elastic_search":
│  132: resource "aws_elasticsearch_domain_policy" "elastic_search" {
│

Terraform Configuration Files

resource "aws_elasticsearch_domain" "elastic_search" {
  provider = aws.ou

  for_each = local.applications_elastic_search

  domain_name           = "${local.name}-${each.key}"
  elasticsearch_version = each.value.version

  cluster_config {
    instance_type            = each.value.instance_type
    zone_awareness_enabled   = each.value.zone_awareness_enabled // set to true to enable multi AZ.  set to false to set 1xAZ
    instance_count           = each.value.instance_count         // must be multiples of the AZ count, eg 3xAZ means 3/6/9 instances
    dedicated_master_enabled = false

    dynamic "zone_awareness_config" {
      for_each = each.value.zone_awareness_enabled == true ? [0] : []
      content {
        availability_zone_count = each.value.availability_zone_count // must be 2 or 3.  Ignored if zone_awareness_enabled is set to false
      }
    }
  }

  vpc_options {
    subnet_ids = slice(tolist(module.subnets.private_subnet_ids), 0, each.value.zone_awareness_enabled == false ? 1 : each.value.availability_zone_count)

    security_group_ids = [
      aws_security_group.elastic_search_domain[0].id
    ]
  }

  advanced_security_options {
    enabled                        = true
    internal_user_database_enabled = true

    master_user_options {
      master_user_name     = "admin"
      master_user_password = random_password.elastic_search[each.key].result
    }
  }

  encrypt_at_rest {
    enabled = true
  }

  node_to_node_encryption {
    enabled = true
  }

  domain_endpoint_options {
    enforce_https       = true
    tls_security_policy = "Policy-Min-TLS-1-2-2019-07"
  }

  ebs_options {
    ebs_enabled = true
    volume_size = each.value.volume_size
    volume_type = each.value.volume_type
  }

  dynamic "log_publishing_options" {
    for_each = local.elastic_search_logs
    content {
      cloudwatch_log_group_arn = aws_cloudwatch_log_group.elastic_search[log_publishing_options.key].arn
      log_type                 = log_publishing_options.value.log
    }
  }
}
resource "aws_elasticsearch_domain_policy" "elastic_search" {
  provider = aws.ou

  depends_on = [time_sleep.wait_60_seconds_for_elastic_search]

  for_each = local.applications_elastic_search

  domain_name     = aws_elasticsearch_domain.elastic_search[each.key].domain_name
  access_policies = data.aws_iam_policy_document.elasticsearch[each.key].json
}

Steps to Reproduce

The code above reproduces the issue, but is quite specific to my environment. It could be simplified, but we would need to ensure the first resource still takes long enough to initialize in AWS (i.e. a simpler version of the resource may initialize more quickly).

Debug Output

No response

Panic Output

No response

Important Factoids

If I implement a delay of 60 seconds between the two resources (using time_sleep resource), then it works.
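
The time_sleep resource itself is not included in the configuration above, which only references it by name in depends_on. A minimal sketch of what it might look like, assuming the hashicorp/time provider is available; the 60s duration comes from the description, while the depends_on target is an assumption:

```hcl
# Hypothetical reconstruction of the sleep referenced by
# depends_on = [time_sleep.wait_60_seconds_for_elastic_search].
resource "time_sleep" "wait_60_seconds_for_elastic_search" {
  # Wait 60 seconds after the dependencies below have been created.
  create_duration = "60s"

  # Assumption: the sleep waits on every instance of the domain resource.
  depends_on = [aws_elasticsearch_domain.elastic_search]
}
```

With this in place, the ordering is domain, then sleep, then policy, so the policy update is only attempted once the delay has elapsed.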

References

No response

Would you like to implement a fix?

None

scott-doyland-burrows avatar Feb 14 '24 15:02 scott-doyland-burrows

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

github-actions[bot] avatar Feb 14 '24 15:02 github-actions[bot]

We are also facing the same issue. We tried adding the implicit policy, creating the policy as a separate resource, and using depends_on to point to the ES domain resource, but with no luck: the apply randomly succeeds or fails.

Karthikraja9 avatar Feb 20 '24 11:02 Karthikraja9

Our organization is also facing this issue, although we are using OpenSearch and the accompanying aws_opensearch_domain_policy. We are using OpenSearch_2.7, Terraform v1.7, AWS provider 5.34.

jabkes avatar Feb 21 '24 16:02 jabkes

I'm also facing this issue. I think it could be resolved by checking the domain status before updating the policy: https://github.com/hashicorp/terraform-provider-aws/blob/b9d9303a4d355bd1cc93e7b5e1e708bd571e2799/internal/service/opensearch/domain_policy.go#L93-L96

rariyama avatar Feb 25 '24 08:02 rariyama

We are facing the same issue using Terraform v1.7.4, AWS provider 5.37.0 on OpenSearch_2.11.

novotnymiro avatar Feb 27 '24 09:02 novotnymiro

I am also facing the same issue with Terraform v1.7.1, AWS provider 5.34.0 on OpenSearch_2.11.

vani0123 avatar Feb 27 '24 10:02 vani0123

The issue was not occurring with AWS provider 5.38.0 at first, but the same error has started showing up again with version 5.38.0.

vani0123 avatar Feb 28 '24 04:02 vani0123

I'm still hitting this issue on AWS provider 5.38.0

mgrov-ksc avatar Mar 04 '24 22:03 mgrov-ksc

I was facing a similar issue while creating an OpenSearch domain.

I added a sleep between aws_opensearch_domain and aws_opensearch_domain_policy: the time_sleep depends on aws_opensearch_domain, and aws_opensearch_domain_policy depends on the time_sleep.

The timer was set to 10 minutes, which I picked arbitrarily; a shorter sleep might work as well.

Below is the example:

resource "aws_opensearch_domain" "domain" {
  # ------config-------
}

resource "time_sleep" "delay_10_min" {
  create_duration = "10m"

  depends_on = [aws_opensearch_domain.domain]
}

resource "aws_opensearch_domain_policy" "main" {
  # domain_name and access_policies omitted here, like the domain config above

  depends_on = [time_sleep.delay_10_min]
}

Hope this helps

aksh-sood avatar Mar 21 '24 18:03 aksh-sood

Yes - that's exactly what I did as per the first post in this issue.

scott-doyland-burrows avatar Mar 24 '24 12:03 scott-doyland-burrows

[!WARNING] This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

github-actions[bot] avatar Mar 26 '24 19:03 github-actions[bot]

This functionality has been released in v5.43.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
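
To pick up the release mentioned above, the provider version constraint can be raised. A minimal sketch, assuming the provider is declared in a required_providers block with the standard hashicorp/aws source; only the 5.43.0 version number comes from this thread:

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Minimum version that includes the change announced in this issue.
      version = ">= 5.43.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a newer provider version than the one recorded in the dependency lock file.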

github-actions[bot] avatar Mar 28 '24 21:03 github-actions[bot]

I'm still seeing this error on v5.43.0 with aws_elasticsearch_domain_policy. Was this fixed for aws_elasticsearch_domain_policy as well or just aws_opensearch_domain_policy?

msmith93 avatar Mar 28 '24 23:03 msmith93

This fix was applied to both the elasticsearch and opensearch resources. Can you open a new issue with the configuration you're observing this with?

jar-b avatar Mar 29 '24 13:03 jar-b

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Apr 29 '24 02:04 github-actions[bot]