blackbox_exporter
ip_protocol_fallback when IPv6 target returns icmp6 unreachable
Host operating system: output of uname -a
Linux prometheus 5.4.0-80-generic #90~18.04.1-Ubuntu SMP Tue Jul 13 19:40:02 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
blackbox_exporter version: output of blackbox_exporter --version
blackbox_exporter, version 0.19.0 (branch: HEAD, revision: 5d575b88eb12c65720862e8ad2c5890ba33d1ed0)
build user: root@2b0258d5a55a
build date: 20210510-12:56:44
go version: go1.16.4
platform: linux/amd64
What is the blackbox.yml module config.
modules:
  certificate:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
      tls_config: {}
What is the prometheus.yml scrape config.
n/a
What logging output did you get from adding &debug=true to the probe URL?
# time curl -g 'localhost:9115/probe?module=certificate&target=prometheus.example.com:443&debug=true'
Logs for the probe:
ts=2021-08-14T10:57:38.558987285Z caller=main.go:320 module=certificate target=prometheus.example.com:443 level=info msg="Beginning probe" probe=tcp timeout_seconds=5
ts=2021-08-14T10:57:38.559144124Z caller=tcp.go:40 module=certificate target=prometheus.example.com:443 level=info msg="Resolving target address" ip_protocol=ip6
ts=2021-08-14T10:57:38.559387303Z caller=tcp.go:40 module=certificate target=prometheus.example.com:443 level=info msg="Resolved target address" ip=2606:4700:1:1::9876
ts=2021-08-14T10:57:38.559436728Z caller=tcp.go:121 module=certificate target=prometheus.example.com:443 level=info msg="Dialing TCP with TLS"
ts=2021-08-14T10:57:38.566401836Z caller=main.go:130 module=certificate target=prometheus.example.com:443 level=error msg="Error dialing TCP" err="dial tcp6 [2606:4700:1:1::9876]:443: connect: no route to host"
ts=2021-08-14T10:57:38.566488423Z caller=main.go:320 module=certificate target=prometheus.example.com:443 level=error msg="Probe failed" duration_seconds=0.007430518
Metrics that would have been returned:
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.000288299
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.007430518
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 1.706353704e+09
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 6
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0
Module configuration:
prober: tcp
timeout: 5s
http:
  ip_protocol_fallback: true
  follow_redirects: true
tcp:
  ip_protocol_fallback: true
  tls: true
icmp:
  ip_protocol_fallback: true
dns:
  ip_protocol_fallback: true
real 0m0.038s
user 0m0.012s
sys 0m0.012s
What did you do that produced an error?
Create a target name that has both IPv4 and IPv6 addresses, where the IPv6 address returns "unreachable".
For testing purposes I used this in /etc/hosts:
# cat /etc/hosts
127.0.0.1 localhost
172.67.201.240 prometheus.example.com
2606:4700:1:1::9876 prometheus.example.com
# ping6 prometheus.example.com
PING prometheus.example.com(prometheus.example.com (2606:4700:1:1::9876)) 56 data bytes
From linx-lon1.as13335.net (2001:7f8:4::3417:1) icmp_seq=1 Destination unreachable: Address unreachable
From linx-lon1.as13335.net (2001:7f8:4::3417:1) icmp_seq=2 Destination unreachable: Address unreachable
From linx-lon1.as13335.net (2001:7f8:4::3417:1) icmp_seq=3 Destination unreachable: Address unreachable
What did you expect to see?
Since ip_protocol_fallback: true is set, I expected the failed connection on IPv6 to be followed by a connection attempt on IPv4.
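The behaviour expected here is fallback at connection time rather than at resolution time. A minimal Python sketch of that idea follows. `connect_any` is a hypothetical helper, not something blackbox_exporter provides, and the injectable `connect` parameter is an assumption added so the logic can be exercised without a network.

```python
import socket


def connect_any(candidates, connect=socket.create_connection, timeout=5.0):
    """Try each (host, port) in order and return (address, connection)
    for the first candidate that connects.

    Hypothetical illustration of connection-level fallback; this is not
    how blackbox_exporter behaves.
    """
    last_err = None
    for addr in candidates:
        try:
            return addr, connect(addr, timeout)
        except OSError as err:
            # e.g. "no route to host" or "invalid argument": move on to
            # the next candidate instead of failing the whole attempt.
            last_err = err
    raise last_err or OSError("no candidate addresses")
```

Called with the IPv6 address first and the IPv4 address second, this would move on to 172.67.201.240 after the IPv6 connect fails.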
What did you see instead?
No attempt is made to connect on IPv4.
tcpdump shows:
# tcpdump -i eth0 -nn host 172.67.201.240 or host 2606:4700:1:1::9876 or icmp6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:00:35.563003 IP6 XXXX:XXXX:XXXX:XXXX::33.52582 > 2606:4700:1:1::9876.443: Flags [S], seq 441588982, win 64800, options [mss 1440,sackOK,TS val 92703423 ecr 0,nop,wscale 7], length 0
11:00:35.568393 IP6 2001:7f8:4::3417:1 > XXXX:XXXX:XXXX:XXXX::33: ICMP6, destination unreachable, unreachable address 2606:4700:1:1::9876, length 88
^C
Additional info
Similar results are obtained using ip -6 route add blackhole 2001:7f8:4::3417. In this case you get an EINVAL generated locally, instead of an icmp6 unreachable:
ts=2021-08-14T11:34:18.245170246Z caller=main.go:130 module=certificate target=prometheus.example.com:443 level=error msg="Error dialing TCP" err="dial tcp6 [2001:7f8:4::3417]:443: connect: invalid argument"
But again, blackbox_exporter (BBE) does not fall back to IPv4.
As posted by @roidelapluie in https://github.com/prometheus/blackbox_exporter/issues/819#issuecomment-904590649:
ip_protocol_fallback is only for DNS resolution, and this is the expected behaviour.
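To illustrate the distinction drawn in that comment, here is a minimal Python sketch of resolution-only fallback. It is not blackbox_exporter's actual code (which is written in Go). `choose_ip` and its parameters are hypothetical names, and the logic is a simplification that assumes the fallback is consulted only when the preferred family has no address at all.

```python
import ipaddress


def choose_ip(resolved, preferred="ip6", fallback=True):
    """Pick a single address from the resolved set.

    Hypothetical simplification: the fallback flag matters only when
    the preferred family is missing from the DNS answer. Whether the
    chosen address is reachable is never considered.
    """
    want = 6 if preferred == "ip6" else 4
    addrs = [ipaddress.ip_address(a) for a in resolved]

    # First choice: any address of the preferred family.
    for addr in addrs:
        if addr.version == want:
            return str(addr)

    # Fallback: only reached when no preferred-family address exists.
    if fallback and addrs:
        return str(addrs[0])
    return None
```

With the /etc/hosts entries from this report, the name does have an IPv6 address, so that address is selected and the IPv4 address is never tried, whether or not the subsequent connect succeeds.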