icinga2

[bug?] += causes obscure issues when working with more complex variables in config parser

Open yoshi314 opened this issue 2 years ago • 3 comments

I have a stanza in my generic service template

if (host.vars.all_services) {
    vars += host.vars.all_services
}

which I use to override the most common variables for all services on a given host, usually notification contacts, time periods, etc.

Recently I noticed odd behaviour: certain variables did not have the values I had set them to.

On further inspection I found that the line above is the cause. Contrary to my expectations, it does not add my variables to the existing structure but wipes out most of it.

E.g. I have something like this in my service definition:

  vars.curl_checks["pr01_check"] = {
    jira_notification_config = {
      jira_template="PR01_support"
      users = [ "jira_PR01" ]
      jira_project = "PROJ01"
    }
    notification.jira.times = {
      begin = 10m
    }
    notification.sms.users = [ "cmb-sms" , "dyzur_20_7" ]
    notification.sms.period = "24x7_bez_7_20"
    notification.users += [ "someuser" ]
    notification.sms.testvalue = "this variable exists"
(...)
  }

If the host the service is applied to has

  vars.all_services.notification.mail.extra_groups += [ "support-PR01" ]

it wipes out most of the variables under notification.* and falls back to the defaults for the given service or host. In my case the notification.sms.* vars are wiped out completely.

Commenting out that line in service template fixes my problems.

Is this a bug or my misunderstanding of handling more complex config objects?

yoshi314, Nov 20 '23 12:11

Which version are you using?

Also, in the middle code snippet, you are setting notification.sms.users within vars.curl_checks["pr01_check"], is that intended?

I think it would help if you'd also share more of the object definitions so that it's clear what's within an object/template/apply rule and what is imported where.

> Commenting out that line in service template fixes my problems.

Having a look at the output of the following command for the affected service with and without the line commented out could also give some clues:

icinga2 object list -t Service -n 'affected-host-name!affected-service-name'

julianbrost, Nov 20 '23 13:11

I am running icinga2 2.14 on Debian. Packages are from Icinga's deb repository.

This is what was produced when I had that line enabled:

{
  "curl_credentials": "configurator:sekrit",
  "curl_critical": 7200,
  "curl_hostname": "server..com",
  "curl_port": 8080,
  "curl_request": "GET",
  "curl_url": "management/queue/monitoring/secondsSinceTheOldestMessageInQueue/ACTIVE",
  "curl_warning": 3600,
  "display_name": "Oldest message in w ACTIVE",
  "jira_notification_config": {
    "jira_project": "MU",
    "jira_template": "cmb_utrzymanie",
    "times": {
      "begin": 600
    },
    "users": [
      "jira_cmb"
    ]
  },
  "notification": {
    "jira": {
      "times": {
        "begin": 3600
      }
    },
    "mail": {
      "extra_groups": [
        "utrzymanie-cmb"
      ],
      "interval": 86400,
      "period": "workhours"
    }
  }
}

I only had the older dump, which I formatted via jq.

And this is the vars section when the line is disabled:

  * vars
    % = modified in '/etc/icinga2/zones.d/global/service/check_curl_value.conf', lines 4:2-4:15
    * curl_credentials = "configurator:sekrit"
    * curl_critical = 7200
    * curl_hostname = "server..com"
    * curl_port = 8080
    * curl_request = "GET"
    * curl_url = "management/queue/monitoring/secondsSinceTheOldestMessageInQueue/ACTIVE"
    * curl_warning = 3600
    * display_name = "Oldest message in ACTIVE"
    * notification
      * sms
        * users = [ "cmb-sms", "dyzur_20_7" ]
      * users = [ "grczyz" ]

> Also, in the middle code snippet, you are setting notification.sms.users within vars.curl_checks["pr01_check"], is that intended?

I think I was trying to narrow down what was going on; that was superfluous.

The relevant config is here

  vars.curl_checks["queue-oldest-msg-ACTIVE"] = {
    display_name = "Oldest message in ACTIVE"
    curl_credentials = "configurator:sekrit"
    curl_hostname = "server.com"
    curl_url = "management/queue/monitoring/secondsSinceTheOldestMessageInQueue/ACTIVE"
    curl_port = 8080
    curl_request = "GET"
    curl_critical = 7200
    curl_warning = 3600
    notification.users += [ "grczyz" ]
    notification.sms.users = [ "cmb-sms" , "dyzur_20_7" ]
  }

This is constructed via generic for loop

apply Service "curl_check" for (check_name => config in host.vars.curl_checks) {
        check_command = "check-curl-value"

        vars += config
        import "fix-display-name"  // this maps vars.display_name to display_name
        import "generic-service"
        assign where host.vars.curl_checks
}

And the imported template:

template Service "generic-service" {
  max_check_attempts = 5
  check_interval = 5m
  retry_interval = 90s
  check_period = "24x7"

// this causes problems, destroying existing service.vars.notification.* 
  //if (host.vars.all_services) {
  //  vars += host.vars.all_services
  //}

// import global host defaults, i added this as a workaround. previously i searched through vars imported in above section
  if (host.vars.all_services.notification.period) {
    check_period = host.vars.all_services.notification.period
  }
  if (host.vars.all_services.notification.interval) {
    check_interval = host.vars.all_services.notification.interval
  }
  if (host.vars.all_services.notification.check_attempts) {
    max_check_attempts = host.vars.all_services.notification.check_attempts
  }

  if (service.vars.check_period) { 
    check_period = service.vars.check_period
  }

  if (service.vars.check_interval) { 
    check_interval = service.vars.check_interval
  }

  if (service.vars.max_check_attempts) { 
    max_check_attempts = service.vars.max_check_attempts
  }
}

yoshi314, Nov 22 '23 14:11

vars += host.vars.all_services does something quite different.

To me it looks like you expect vars.all_services.notification.mail.extra_groups += [ "support-PR01" ] to be the same as

vars += {
  all_services = {
    notification = {
      mail = {
        extra_groups = [ "support-PR01" ]
      }
    }
  }
}

But the latter is more like this:

vars.all_services = {
  notification = {
    mail = {
      extra_groups = [ "support-PR01" ]
    }
  }
}

That is, += does not attempt to deep-merge any data structures. So with base and extra being dicts, base += extra takes all keys and values contained directly in extra and sets them in base (where set means replacing the value at a key if that was set before).
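
To make the difference concrete, here is a minimal, untested sketch in the Icinga 2 DSL. The first part replays the shallow behaviour of += with values taken from this thread. The second part defines a helper that merges nested dictionaries key by key. The name deep_merge is hypothetical and not a built-in, and the sketch assumes that typeof() and clone() behave as described in the Icinga 2 library reference.

```icinga2
// Service vars as built from the curl_checks entry (values from this thread).
var base = {
  notification = {
    users = [ "grczyz" ]
    sms = { users = [ "cmb-sms", "dyzur_20_7" ] }
  }
}

// What host.vars.all_services contains on the affected host.
var extra = {
  notification = {
    mail = { extra_groups = [ "support-PR01" ] }
  }
}

// Shallow: the top-level key "notification" is replaced as a whole,
// so notification.users and notification.sms are gone afterwards.
base += extra

// Hypothetical helper (not part of Icinga 2): merge dictionaries recursively.
// Non-dictionary values, including arrays, are still replaced, not appended.
function deep_merge(base, extra) {
  // Nothing to merge into: just take the other side.
  if (typeof(base) != Dictionary) {
    return extra
  }

  // Work on a copy so the caller's dictionary is left untouched.
  var result = base.clone()

  for (key => value in extra) {
    if (typeof(value) == Dictionary && typeof(result[key]) == Dictionary) {
      // Both sides hold a dictionary at this key: descend instead of replacing.
      result[key] = deep_merge(result[key], value)
    } else {
      result[key] = value
    }
  }

  return result
}
```

Under these assumptions, the commented-out line in the generic-service template could become vars = deep_merge(vars, host.vars.all_services), which would keep notification.sms.* and add notification.mail.* alongside it. The function has to be defined in a config file that is loaded before the template is evaluated.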

julianbrost, Nov 22 '23 14:11