
Unable to successfully ping internal IPs

Open • SnowyTerror opened this issue 2 years ago • 0 comments

Host operating system

Raspberry Pi4 running docker with portainer

blackbox container has a network type of bridge which is also linked to Prometheus and Grafana

blackbox_exporter version

0.19.0

blackbox.yml config

modules:
  http_2xx:
    prober: http
    http:
      preferred_ip_protocol: "ip4"
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp

prometheus.yml scrape config

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'Alertmanager'

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
    - 'alert.rules'
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:

  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 10 seconds.
    scrape_interval: 10s
    static_configs:
         - targets: ['localhost:9090']


  - job_name: 'speedtest'
    metrics_path: /metrics
    scrape_interval: 30m
    scrape_timeout: 180s # running speedtest needs time to complete
    static_configs:
      - targets: ['speedtest:9798']

  - job_name: 'ping'
    metrics_path: /probe
    scrape_interval: 10s
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    file_sd_configs:
      - files:
        - pinghosts.yaml
    relabel_configs:
      - source_labels: [__address__]
        regex: '(.*);(.*);(.*);(.*)'  #first is the url, thus unique for instance
        target_label: instance
        replacement: $1
      - source_labels: [__address__]
        regex: '(.*);(.*);(.*);(.*)'  #second is humanname to use in charts
        target_label: humanname
        replacement: $2
      - source_labels: [__address__]
        regex: '(.*);(.*);(.*);(.*)'  #third state whether this is testing external or internal network
        target_label: routing
        replacement: $3
      - source_labels: [__address__]
        regex: '(.*);(.*);(.*);(.*)'  #fourth is which switch/router the device is connected to.
        target_label: switching
        replacement: $4
      - source_labels: [instance]
        target_label: __param_target
      - target_label: __address__
        replacement: ping:9115  # The blackbox exporter's real hostname:port.

  - job_name: 'nodeexp'
    static_configs:
    - targets: ['nodeexp:9100']

pinghosts.yaml file

- targets:  # url;humanname;routing;switch
    - http://google.com;Google;external;router
    - http://github.com;Github;external;router
    - https://steamcommunity.com;Steam;external;router
    - https://10.0.10.10;Home Assistant;internal;switch24
    - http://10.0.10.20;Traffic Monitor;internal;switch24
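
To make the target format easier to check, here is a minimal Python sketch of how the four relabel rules in the 'ping' job split one pinghosts.yaml entry. It is an illustration only, not Prometheus code: the function name `split_target` is hypothetical, and `re.fullmatch` is used to imitate the fact that Prometheus anchors relabel regexes at both ends.

```python
import re

# Same pattern as the relabel_configs in the 'ping' job.
# re.fullmatch imitates Prometheus anchoring the regex at both ends.
TARGET_RE = re.compile(r"(.*);(.*);(.*);(.*)")


def split_target(entry: str) -> dict:
    """Split one 'url;humanname;routing;switch' entry into its four labels.

    Hypothetical helper for illustration. Prometheus itself would silently
    skip a 'replace' rule whose regex does not match; this sketch raises
    instead so malformed entries are easy to spot.
    """
    match = TARGET_RE.fullmatch(entry)
    if match is None:
        raise ValueError(f"expected url;humanname;routing;switch, got {entry!r}")
    url, humanname, routing, switching = match.groups()
    return {
        "instance": url,          # also copied to __param_target
        "humanname": humanname,
        "routing": routing,
        "switching": switching,
    }


print(split_target("https://10.0.10.10;Home Assistant;internal;switch24"))
```

Only the first field ends up in `__param_target`, so the exporter probes exactly the URL as written, including its scheme and implied port.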

Log from one failed probe

Logs for the probe:
ts=2021-12-06T01:00:03.844485785Z caller=main.go:320 module=http_2xx target=https://10.0.10.10 level=info msg="Beginning probe" probe=http timeout_seconds=9.5
ts=2021-12-06T01:00:03.844890243Z caller=http.go:335 module=http_2xx target=https://10.0.10.10 level=info msg="Resolving target address" ip_protocol=ip4
ts=2021-12-06T01:00:03.844966038Z caller=http.go:335 module=http_2xx target=https://10.0.10.10 level=info msg="Resolved target address" ip=10.0.10.10
ts=2021-12-06T01:00:03.845120703Z caller=client.go:251 module=http_2xx target=https://10.0.10.10 level=info msg="Making HTTP request" url=https://10.0.10.10 host=10.0.10.10
ts=2021-12-06T01:00:03.846306058Z caller=main.go:130 module=http_2xx target=https://10.0.10.10 level=error msg="Error for HTTP request" err="Get \"https://10.0.10.10\": dial tcp 10.0.10.10:443: connect: connection refused"
ts=2021-12-06T01:00:03.846418075Z caller=main.go:130 module=http_2xx target=https://10.0.10.10 level=info msg="Response timings for roundtrip" roundtrip=0 start=2021-12-06T01:00:03.845317515Z dnsDone=2021-12-06T01:00:03.845317515Z connectDone=2021-12-06T01:00:03.846207282Z gotConn=0001-01-01T00:00:00Z responseStart=0001-01-01T00:00:00Z tlsStart=0001-01-01T00:00:00Z tlsDone=0001-01-01T00:00:00Z end=0001-01-01T00:00:00Z
ts=2021-12-06T01:00:03.84655661Z caller=main.go:320 module=http_2xx target=https://10.0.10.10 level=error msg="Probe failed" duration_seconds=0.001967049



Metrics that would have been returned:
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.000140165
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.001967049
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length 0
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0
probe_http_duration_seconds{phase="processing"} 0
probe_http_duration_seconds{phase="resolve"} 0.000140165
probe_http_duration_seconds{phase="tls"} 0
probe_http_duration_seconds{phase="transfer"} 0
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 0
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 0
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 0
# HELP probe_http_uncompressed_body_length Length of uncompressed response body
# TYPE probe_http_uncompressed_body_length gauge
probe_http_uncompressed_body_length 0
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 0
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 3.85946281e+08
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0



Module configuration:
prober: http
http:
    preferred_ip_protocol: ip4
    ip_protocol_fallback: true
    follow_redirects: true
tcp:
    ip_protocol_fallback: true
icmp:
    ip_protocol_fallback: true
dns:
    ip_protocol_fallback: true
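
The output above has the shape of what the exporter's `/probe` endpoint returns when `debug=true` is passed. As a sketch for anyone reproducing this, the following builds such a URL. The exporter address `ping:9115` is taken from the scrape config and is assumed to be reachable from wherever the request is made (for example from another container on the same bridge network); the helper name `probe_url` is hypothetical.

```python
from urllib.parse import urlencode


def probe_url(exporter: str, target: str, module: str = "http_2xx",
              debug: bool = True) -> str:
    """Build a blackbox_exporter probe URL (hypothetical helper).

    exporter is host:port of the exporter, e.g. 'ping:9115' as in the
    scrape config; target and module are passed as query parameters.
    """
    params = {"target": target, "module": module}
    if debug:
        params["debug"] = "true"
    return f"http://{exporter}/probe?{urlencode(params)}"


print(probe_url("ping:9115", "https://10.0.10.10"))
```

Fetching the resulting URL with a browser or `curl` should show the probe logs, metrics, and module configuration for a single target without waiting for a Prometheus scrape.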

What I did

I added two local IP addresses to pinghosts.yaml and tried each of them in the following formats:

  • [IP address]
  • http://[IP address]
  • https://[IP address]

What I expected to see

I expected Grafana to show both internal IP addresses as responding.

What I saw instead

Only the Traffic Monitor address (10.0.10.20) responded; the other internal address (10.0.10.10, Home Assistant) did not. The Traffic Monitor address is the host that the blackbox exporter itself runs on.
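
The failed probe log reports "connection refused" for 10.0.10.10:443, which means the TCP connection was actively rejected rather than timing out. A minimal Python sketch for checking which ports accept connections, ideally run from inside the same Docker network as the exporter, might look like this. The helper name `check_tcp` is hypothetical, and the ports in the usage comment are assumptions: 8123 is Home Assistant's default HTTP port, but whether this installation uses it is not known from the report.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a plain TCP connect and classify the result (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a firewall or routing problem.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "no route to host".
        return f"error: {exc}"


# Example usage (ports are assumptions, adjust to the actual service):
# for port in (80, 443, 8123):
#     print(port, check_tcp("10.0.10.10", port))
```

A "refused" result on 443 combined with "open" on another port would suggest the target URL needs an explicit port or a different scheme, while "timeout" would point at the network path between the container and the host.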

SnowyTerror • Dec 06 '21 01:12