
ocf:heartbeat:named ERROR: named didn't answer properly for localhost.

Open larrea opened this issue 5 years ago • 1 comments

The named OCF resource agent runs the host command ("/usr/bin/host") for its monitor check. This is inefficient because host appends each domain on the resolver search path to the name before trying the name as given.

For example, given the following domain search path in /etc/resolv.conf:

search dev.loc.example.com loc.example.com example.com

The host command would produce the following DNS lookups:

23-Sep-2020 15:18:10.287 queries: info: client @0x7f31fc038420 127.0.0.1#48035 (localhost.dev.loc.example.com): query: localhost.dev.loc.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#48700 (localhost.loc.example.com): query: localhost.loc.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#43040 (localhost.example.com): query: localhost.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#37220 (localhost): query: localhost IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#38048 (localhost): query: localhost IN AAAA + (127.0.0.1)
23-Sep-2020 15:18:10.289 queries: info: client @0x7f31fc038420 127.0.0.1#59271 (localhost): query: localhost IN MX + (127.0.0.1)

These extra lookups can cause false negatives (cases 1 and 2) and add needless load (case 3):

  1. any domain on the search path returns SERVFAIL instead of NXDOMAIN, which can be caused by a zone misconfiguration
  2. the query times out before the bare "localhost" name is tried
  3. the extra queries are unnecessary, since the monitor is only checking that the resource record localhost returns 127.0.0.1
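
To make the expansion concrete, here is a minimal sketch of how a dotless name is turned into the sequence of names seen in the query log above. The function name candidate_names is hypothetical and not part of the resource agent, and the logic is deliberately simplified: it assumes the default ndots:1 behaviour and only reads the "search" directive.

```shell
# Hypothetical helper (not part of the named agent): print the names a stub
# resolver would try, in order, for a name that contains no dots.
# Simplified: assumes the default ndots:1 and ignores "domain" and "options".
candidate_names() {
    name=$1
    resolv_conf=${2:-/etc/resolv.conf}

    # Collect the domains from the last "search" line, if there is one.
    domains=$(awk '$1 == "search" { $1 = ""; line = $0 } END { print line }' "$resolv_conf")

    # Each search domain is appended to the name and tried first.
    for domain in $domains; do
        printf '%s.%s\n' "$name" "$domain"
    done

    # The name as given is only tried after the search list is exhausted.
    printf '%s\n' "$name"
}
```

With the search path shown earlier, candidate_names localhost prints the three expanded names followed by localhost, which is the order of the A queries in the log. A name written with a trailing dot ("localhost.") is treated as absolute and skips the search list, and dig does not apply the search list unless +search is given.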

This is the command executed:

$ host localhost 127.0.0.1
Using domain server:
Name: 127.0.0.1
Address: 127.0.0.1#53
Aliases:

localhost has address 127.0.0.1

which can be replaced with:

$ dig @localhost localhost +short
127.0.0.1

For those reasons, I submit the following patch for your consideration:

diff --git a/heartbeat/named b/heartbeat/named
index 535410df..b14f65a0 100755
--- a/heartbeat/named
+++ b/heartbeat/named
@@ -18,7 +18,7 @@
 #Defaults
 OCF_RESKEY_named_default="/usr/sbin/named"
 OCF_RESKEY_rndc_default="/usr/sbin/rndc"
-OCF_RESKEY_host_default="/usr/bin/host"
+OCF_RESKEY_host_default="/usr/bin/dig"
 OCF_RESKEY_named_user_default=named
 OCF_RESKEY_named_config_default=""
 OCF_RESKEY_named_pidfile_default="/var/run/named/named.pid"
@@ -26,7 +26,7 @@ OCF_RESKEY_named_rootdir_default=""
 OCF_RESKEY_named_options_default=""
 OCF_RESKEY_named_keytab_file_default=""
 OCF_RESKEY_rndc_options_default=""
-OCF_RESKEY_host_options_default=""
+OCF_RESKEY_host_options_default="+short"
 OCF_RESKEY_monitor_request_default="localhost"
 OCF_RESKEY_monitor_response_default="127.0.0.1"
 OCF_RESKEY_monitor_ip_default="127.0.0.1"
@@ -328,9 +328,9 @@ named_monitor() {
         return $OCF_NOT_RUNNING
     fi
 
-    output=`$OCF_RESKEY_host $OCF_RESKEY_host_options $OCF_RESKEY_monitor_request $OCF_RESKEY_monitor_ip`
+    output=$($OCF_RESKEY_host $OCF_RESKEY_host_options $OCF_RESKEY_monitor_request @$OCF_RESKEY_monitor_ip)
 
-    if [ $? -ne 0 ] || ! echo $output | grep -q '.* has .*address '"$OCF_RESKEY_monitor_response"
+    if [ $? -ne 0 ] || ! echo $output | grep -q "$OCF_RESKEY_monitor_ip"
     then
         ocf_exit_reason "named didn't answer properly for $OCF_RESKEY_monitor_request."
         ocf_log err "Expected: $OCF_RESKEY_monitor_response."

larrea, Sep 23 '20 19:09

I think it'd be a better idea to add an option to use dig to avoid issues for users upgrading who rely on this functionality.
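
One way such an option could work, as a rough sketch only: pick the argument order and the output check from the name of the configured tool, so existing host-based configurations keep their current behaviour. The helpers monitor_command and response_matches are hypothetical names and do not exist in the agent, and checking dig's "+short" output for an exact line match is an assumption about how the comparison might be done.

```shell
# Hypothetical helper: print the lookup command line for the configured tool.
# dig takes the server as "@server"; host takes it as a plain last argument.
monitor_command() {
    tool=$1; options=$2; request=$3; server=$4
    case "$(basename "$tool")" in
        dig) echo "$tool $options $request @$server" ;;
        *)   echo "$tool $options $request $server" ;;
    esac
}

# Hypothetical helper: succeed if the tool's output contains the expected answer.
response_matches() {
    tool=$1; output=$2; expected=$3
    case "$(basename "$tool")" in
        # Assumes "+short" output: one bare answer per line.
        dig) echo "$output" | grep -qxF "$expected" ;;
        # Same pattern the agent currently uses for host output.
        *)   echo "$output" | grep -q ".* has .*address $expected" ;;
    esac
}
```

For example, monitor_command /usr/bin/dig +short localhost 127.0.0.1 prints "/usr/bin/dig +short localhost @127.0.0.1", while passing /usr/bin/host keeps the argument order used today.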

oalbrigt, Sep 25 '07:09