terraform-provider-dynatrace

dynatrace_slo_v2 resources created with count get arbitrarily reordered

Open mattBaumBeneva opened this issue 6 months ago • 8 comments

Describe the bug Our code configures an SLO measure for each Mobile Application KUA that has an Apdex configured, like so:

resource "dynatrace_slo_v2" "kua_apdex_slo" {
  count = length(dynatrace_mobile_app_key_performance.kua_adpex)

  name               = "${local.app_name}_${dynatrace_mobile_app_key_performance.kua_adpex[count.index].scope}_slo"
  enabled            = true
  custom_description = "SLO Mobile - App: \"${local.app_name}\" - KUA: \"${substr(local.kua_with_apdex[count.index].nom, 0, 48)}\""
  evaluation_type    = "AGGREGATE"
  evaluation_window  = "-10m"
  filter             = "type(\"DEVICE_APPLICATION_METHOD\"),entityId(\"${dynatrace_mobile_app_key_performance.kua_adpex[count.index].scope}\")"
  metric_expression  = "100*builtin:apps.other.keyUserActions.apdexValue.os:splitBy(\"dt.entity.device_application_method\")"
  target_success     = 70
  target_warning     = 85
  error_budget_burn_rate {
    burn_rate_visualization_enabled = false
  }
}

The provider takes these arbitrarily ordered elements and tries to update every single SLO, including those whose configuration has not been modified. For example, here is a case where the ordering has changed and the provider attempts to rename an element that has not really changed:

14:13:51    # module.Beneva-dev-staging.dynatrace_slo_v2.kua_apdex_slo[21] will be updated in-place
14:13:51    ~ resource "dynatrace_slo_v2" "kua_apdex_slo" {
14:13:51        ~ custom_description = "SLO Mobile - App: \"Beneva-dev-staging\" - KUA: \"🔑 [Load App Manual]\"" -> "SLO Mobile - App: \"Beneva-dev-staging\" - KUA: \"🔑 [savings.navigate_to_customer_portal]\""
14:13:51        ~ filter             = "type(\"DEVICE_APPLICATION_METHOD\"),entityId(\"DEVICE_APPLICATION_METHOD-CF935488A1D21E1F\")" -> "type(\"DEVICE_APPLICATION_METHOD\"),entityId(\"DEVICE_APPLICATION_METHOD-C16A143B9D317F24\")"
14:13:51          id                 = "vu9U3hXa3q0AAAABABZidWlsdGluOm1vbml0b3Jpbmcuc2xvAAZ0ZW5hbnQABnRlbmFudAAkZTJjYTQ4NjItNzU2Ny0zMmRhLThlZTgtZjVkZjQ1ZjY1NmRmvu9U3hXa3q0"
14:13:51        ~ name               = "Beneva-dev-staging_DEVICE_APPLICATION_METHOD-CF935488A1D21E1F_slo" -> "Beneva-dev-staging_DEVICE_APPLICATION_METHOD-C16A143B9D317F24_slo"
14:13:51          # (8 unchanged attributes hidden)
14:13:51  
14:13:51          # (1 unchanged block hidden)
14:13:51      }

This kind of erroneous renaming causes errors, because an SLO with that name already exists (two SLOs end up being given the same name):

14:14:37  │ Error: There is another SLO with an identical name defined. Please make your SLO name unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.
14:14:37  │ There is another identical SLO defined. Please make your SLO unique to save your changes.

Expected behavior If an SLO with a given name (which must be unique) exists in Dynatrace, the provider should not attempt to arbitrarily rename another SLO in this way.

Additional context Using the latest provider version.

mattBaumBeneva Dec 13 '23 19:12

This issue is preventing us from putting important changes into production.

mattBaumBeneva Dec 13 '23 19:12

Hello @mattBaumBeneva,

What defines the number of instances of dynatrace_mobile_app_key_performance? Is it something like this?

resource "dynatrace_mobile_app_key_performance" "kua_adpex" {
  count = length(data.dynatrace_entities.application_methods.entities)
  ...
}

In other words, could it be that the order or number of these application methods has changed since the last terraform apply?

Also, just as a test, what happens when you integrate the index into the names of the SLOs?

resource "dynatrace_slo_v2" "kua_apdex_slo" {
  count = length(dynatrace_mobile_app_key_performance.kua_adpex)
  name  = "${local.app_name}_${dynatrace_mobile_app_key_performance.kua_adpex[count.index].scope}_${count.index}_slo"
  ...
}

Does that apply without error messages? I'm not saying that this should be the solution; I just want to make sure we understand the root cause of the problem.

My suspicion is that dynatrace_entities now produces a different list (not necessarily ordered differently, just with additional or fewer entries). In that case, when dynatrace_slo_v2 iterates over, for example, the third entry (which was previously the fourth), Terraform assumes that the name of the third entry has changed and triggers an update, while the SLO that was previously the third entry still exists in Dynatrace (that one would also get renamed later on).
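To make the suspected index shift concrete, here is a minimal, self-contained sketch. The list values are hypothetical placeholders, not real entity IDs; the point is only that a count-based resource is addressed by position, so the same index can point at a different element once the list gains an entry.

```hcl
# Hypothetical lists standing in for the entity list before and after a change.
locals {
  before = ["kua_a", "kua_c"]
  after  = ["kua_a", "kua_b", "kua_c"] # "kua_b" was added in the middle
}

# With count, instance [1] is whatever sits at position 1 in the list.
output "index_1_before" {
  value = local.before[1] # "kua_c"
}

output "index_1_after" {
  value = local.after[1] # "kua_b": Terraform would plan an in-place rename of instance [1]
}
```

In this sketch, instance [1] would be renamed from "kua_c" to "kua_b" while the object named "kua_c" still exists remotely until instance [2] is processed, which matches the kind of name collision reported above.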

If I'm right with all that, what I would suggest is to not iterate directly over dynatrace_mobile_app_key_performance.kua_adpex but build a map like this:

locals {
  kua_adpex_map = { 
    for kua_adpex in dynatrace_mobile_app_key_performance.kua_adpex : 
      kua_adpex.scope => kua_adpex
  }
}

That allows you to change your dynatrace_slo_v2 resource to use the for_each keyword, similar to this:

resource "dynatrace_slo_v2" "kua_apdex_slo" {
  for_each = local.kua_adpex_map
  name     = "${local.app_name}_${each.value.scope}_slo"
  ...
}
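Spelled out in full, the for_each variant might look like the sketch below. The attribute values are copied from the count-based resource at the top of this issue, and local.kua_adpex_map is the map built in the locals block above. The moved block is an optional extra that was not part of the original suggestion: assuming Terraform 1.1 or later, it tells Terraform that an existing indexed instance is the same object as a keyed one, so the SLO is not destroyed and recreated. The index and entity ID in it are taken from the plan output above purely as an illustration; one such block would be needed per existing instance.

```hcl
resource "dynatrace_slo_v2" "kua_apdex_slo" {
  # Keyed by entity ID (scope), so adding or removing a KUA leaves other instances untouched.
  for_each = local.kua_adpex_map

  name              = "${local.app_name}_${each.key}_slo"
  enabled           = true
  evaluation_type   = "AGGREGATE"
  evaluation_window = "-10m"
  filter            = "type(\"DEVICE_APPLICATION_METHOD\"),entityId(\"${each.key}\")"
  metric_expression = "100*builtin:apps.other.keyUserActions.apdexValue.os:splitBy(\"dt.entity.device_application_method\")"
  target_success    = 70
  target_warning    = 85

  # custom_description is omitted in this sketch because the original
  # expression depended on count.index.

  error_budget_burn_rate {
    burn_rate_visualization_enabled = false
  }
}

# Illustrative only: maps the old positional address to the new keyed address.
moved {
  from = dynatrace_slo_v2.kua_apdex_slo[21]
  to   = dynatrace_slo_v2.kua_apdex_slo["DEVICE_APPLICATION_METHOD-CF935488A1D21E1F"]
}
```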

Dynatrace-Reinhard-Pilz Dec 14 '23 13:12

Ultimately, the change that triggered the problem is that the KUAs are generated from a list of objects, and we added two elements to the end of the list. So, yes, the number of elements has changed since the last apply, but not the order.

You are also correct that we have to use dynatrace_entities like so, because there is no dedicated resource (to my knowledge) to configure KUAs on Mobile Apps like there is on Web Apps (where we use for_each everywhere):

data "dynatrace_entity" "kua_with_apdex" {
  count           = length(local.kua_with_apdex)
  entity_selector = "type(\"DEVICE_APPLICATION_METHOD\"),entityName(\"${local.kua_with_apdex[count.index].nom}\"),fromRelationships.isDeviceApplicationMethodOf(entityId(\"${dynatrace_mobile_application.app_mobile.id}\"))"
}

mattBaumBeneva Dec 14 '23 13:12

I will try multiple tests to see if anything passes.

mattBaumBeneva Dec 14 '23 13:12

@Dynatrace-Reinhard-Pilz, I was able to apply all of our updates by refactoring our module to use only for_each with KUA names as keys. Thanks.
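For anyone landing here with the same problem, the refactoring was roughly along these lines. This is a simplified sketch rather than the actual module code: the local name kua_by_name is made up for illustration, and it assumes each entry of local.kua_with_apdex has a unique nom. local.kua_with_apdex and dynatrace_mobile_application.app_mobile are defined elsewhere in the module.

```hcl
locals {
  # Hypothetical helper: index the KUA definitions by name instead of by list position.
  kua_by_name = { for kua in local.kua_with_apdex : kua.nom => kua }
}

data "dynatrace_entity" "kua_with_apdex" {
  # each.key is the KUA name, so instances are addressed by name, not by index.
  for_each        = local.kua_by_name
  entity_selector = "type(\"DEVICE_APPLICATION_METHOD\"),entityName(\"${each.key}\"),fromRelationships.isDeviceApplicationMethodOf(entityId(\"${dynatrace_mobile_application.app_mobile.id}\"))"
}
```

Keying by name rather than by entity ID also keeps the for_each keys known at plan time, since the names come from a local value while the entity IDs are only available after the lookup.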

I have one final thing to report. This happened only once, but I got the following error. Applying again immediately afterward passed. I'm reporting the error because Terraform indicates it represents a bug:

10:06:42  │ Error: Provider produced inconsistent result after apply
10:06:42  │ 
10:06:42  │ When applying changes to
10:06:42  │ module.Beneva-lab.dynatrace_mobile_app_key_performance.kua_adpex["🔑
10:06:42  │ [savings.navigate_to_customer_portal]"], provider
10:06:42  │ "provider[\"registry.terraform.io/dynatrace-oss/dynatrace\"]" produced an
10:06:42  │ unexpected new value: Root object was present, but now absent.
10:06:42  │ 
10:06:42  │ This is a bug in the provider, which should be reported in the provider's
10:06:42  │ own issue tracker.

mattBaumBeneva Dec 14 '23 15:12

The error message tells me that - at least temporarily - the provider BELIEVED that dynatrace_mobile_app_key_performance.kua_adpex["[savings.navigate_to_customer_portal]"] didn't exist on the target environment.

And, yes, we need to take that seriously, even if it "auto fixes" with the next apply.

May I ask for a favor at this point? I believe I know ALMOST how your module is structured, but would you be able to send us a sample that contains everything you're using here? I'd like to match what you're applying as closely as possible in my personal environment. Perhaps I can trigger the error and investigate further. Feel free to send it to [email protected]. I would know how to get rid of the error message with a blunt approach, but that would be a last resort, as it would certainly NOT address the actual bug.

Dynatrace-Reinhard-Pilz Dec 14 '23 15:12

I sent you our shared module code. This is code we expose to all our product teams internally. If you also need the code that calls the module, I can provide an anonymized example file with the same structure (see my email).

mattBaumBeneva Dec 14 '23 16:12

Thanks a lot! I will keep this issue open until I've been able to make sense of the error message you've seen.

Dynatrace-Reinhard-Pilz Dec 15 '23 10:12