resource-agents
ocf:heartbeat:named ERROR: named didn't answer properly for localhost.
The named OCF resource agent uses the host command ("/usr/bin/host"), which is inefficient because it first tries the monitored name against every domain on the resolver search path.
For example, given the following search path in /etc/resolv.conf:
search dev.loc.example.com loc.example.com example.com
The host command would produce the following DNS lookups:
23-Sep-2020 15:18:10.287 queries: info: client @0x7f31fc038420 127.0.0.1#48035 (localhost.dev.loc.example.com): query: localhost.dev.loc.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#48700 (localhost.loc.example.com): query: localhost.loc.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#43040 (localhost.example.com): query: localhost.example.com IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#37220 (localhost): query: localhost IN A + (127.0.0.1)
23-Sep-2020 15:18:10.288 queries: info: client @0x7f31fc038420 127.0.0.1#38048 (localhost): query: localhost IN AAAA + (127.0.0.1)
23-Sep-2020 15:18:10.289 queries: info: client @0x7f31fc038420 127.0.0.1#59271 (localhost): query: localhost IN MX + (127.0.0.1)
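The order of the first four lookups in the log matches ordinary resolver search-list expansion. As a rough illustration only, the following sketch prints the names that would be tried for a short name. The function name `search_candidates` is hypothetical, and the sketch assumes the name has fewer dots than the resolver's `ndots` setting (default 1), so the search domains are tried before the bare name.

```shell
# Hypothetical helper: print the names a stub resolver would try, in order.
# Assumption: the name has fewer dots than "ndots", so the search list
# is applied first and the bare name is tried last.
# A trailing dot marks the name as absolute and skips the search list.
search_candidates() {
    name=$1
    shift
    case "$name" in
        *.) printf '%s\n' "$name"; return 0 ;;
    esac
    for domain in "$@"; do
        printf '%s.%s\n' "$name" "$domain"
    done
    printf '%s\n' "$name"
}

search_candidates localhost dev.loc.example.com loc.example.com example.com
```

With the search path above this prints localhost.dev.loc.example.com, localhost.loc.example.com, localhost.example.com and finally localhost, one per line.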
These searches can cause false negatives, and add needless load, in a few cases:
- any domain on the search path returns SERVFAIL instead of NXDOMAIN, which can be caused by a zone misconfiguration
- the query times out before the "localhost." record is tried
- the extra lookups are unnecessary, since the monitor only needs the resource record localhost to return 127.0.0.1
This is the command executed:
$ host localhost 127.0.0.1
Using domain server:
Name: 127.0.0.1
Address: 127.0.0.1#53
Aliases:
localhost has address 127.0.0.1
which can be replaced with:
$ dig @localhost localhost +short
127.0.0.1
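As a minimal sketch of what a dig-based check could look like (this is not the agent's code), the helper below queries one server and compares the `+short` output against the expected address. The function name `check_dns_answer` and the `DIG` override variable are hypothetical. The `+time` and `+tries` values are arbitrary choices for illustration. The sketch matches the whole output line, so that 127.0.0.1 does not accidentally match 127.0.0.10.

```shell
# Hypothetical helper, not part of the named agent.
# $1 = server to query, $2 = name to look up, $3 = expected address.
# DIG may be overridden (e.g. for testing); it defaults to "dig".
check_dns_answer() {
    server=$1
    name=$2
    expected=$3
    # dig does not apply the search list by default, so only "$name" is queried.
    answer=$("${DIG:-dig}" +short +time=2 +tries=1 "@$server" "$name" A) || return 1
    # -F: fixed string, -x: match the whole line, -q: quiet
    printf '%s\n' "$answer" | grep -Fqx -- "$expected"
}
```

A call such as `check_dns_answer 127.0.0.1 localhost. 127.0.0.1` returns 0 only when one of the answer lines is exactly the expected address.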
For those reasons I submit the following patch for your consideration:
diff --git a/heartbeat/named b/heartbeat/named
index 535410df..b14f65a0 100755
--- a/heartbeat/named
+++ b/heartbeat/named
@@ -18,7 +18,7 @@
 #Defaults
 OCF_RESKEY_named_default="/usr/sbin/named"
 OCF_RESKEY_rndc_default="/usr/sbin/rndc"
-OCF_RESKEY_host_default="/usr/bin/host"
+OCF_RESKEY_host_default="/usr/bin/dig"
 OCF_RESKEY_named_user_default=named
 OCF_RESKEY_named_config_default=""
 OCF_RESKEY_named_pidfile_default="/var/run/named/named.pid"
@@ -26,7 +26,7 @@ OCF_RESKEY_named_rootdir_default=""
 OCF_RESKEY_named_options_default=""
 OCF_RESKEY_named_keytab_file_default=""
 OCF_RESKEY_rndc_options_default=""
-OCF_RESKEY_host_options_default=""
+OCF_RESKEY_host_options_default="+short"
 OCF_RESKEY_monitor_request_default="localhost"
 OCF_RESKEY_monitor_response_default="127.0.0.1"
 OCF_RESKEY_monitor_ip_default="127.0.0.1"
@@ -328,9 +328,9 @@ named_monitor() {
         return $OCF_NOT_RUNNING
     fi
 
-    output=`$OCF_RESKEY_host $OCF_RESKEY_host_options $OCF_RESKEY_monitor_request $OCF_RESKEY_monitor_ip`
+    output=$($OCF_RESKEY_host $OCF_RESKEY_host_options $OCF_RESKEY_monitor_request @$OCF_RESKEY_monitor_ip)
 
-    if [ $? -ne 0 ] || ! echo $output | grep -q '.* has .*address '"$OCF_RESKEY_monitor_response"
+    if [ $? -ne 0 ] || ! echo $output | grep -q "$OCF_RESKEY_monitor_ip"
     then
         ocf_exit_reason "named didn't answer properly for $OCF_RESKEY_monitor_request."
         ocf_log err "Expected: $OCF_RESKEY_monitor_response."
I think it would be better to add an option to use dig, rather than change the default, so that users who rely on the current host-based behavior do not hit problems when upgrading.
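One way such an opt-in could be sketched, purely as an assumption about how it might look, is to keep host as the default and switch argument syntax only when the configured tool is dig. The function name `build_monitor_cmd` is hypothetical and does not exist in the agent. It only prints the command line that would be run, so the difference between the two syntaxes is visible.

```shell
# Hypothetical helper: print the monitor command line for the configured tool.
# $1 = tool path, $2 = extra options (may be empty),
# $3 = name to query, $4 = server IP.
build_monitor_cmd() {
    tool=$1
    opts=$2
    request=$3
    ip=$4
    case "${tool##*/}" in
        # dig addresses the server as @ip
        dig) printf '%s\n' "$tool${opts:+ $opts} $request @$ip" ;;
        # host (the existing default) takes the server as a plain argument
        *)   printf '%s\n' "$tool${opts:+ $opts} $request $ip" ;;
    esac
}

build_monitor_cmd /usr/bin/host "" localhost 127.0.0.1
build_monitor_cmd /usr/bin/dig "+short" localhost 127.0.0.1
```

Existing configurations that leave the tool set to /usr/bin/host would keep the current command line unchanged under this scheme. The response check would also need to depend on the tool, since the two commands print answers in different formats.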