smokeping_prober
Heatmap is empty for all hosts
The Packet Loss and Latency charts plot fine. Screenshot: https://imgur.com/a/QlZJCnS
Invoked via systemd as:
smokeping_prober --privileged --config.file=/path/to/smokeping_prober.yaml --web.listen-address=:9374 --web.telemetry-path="/metrics"
Also tried with:
--buckets="5e-05,0.0001,0.0002,0.0004,0.0008,0.0016,0.0032,0.0064,0.0128,0.0256,0.0512,0.1024,0.2048,0.4096,0.8192,1.6384,3.2768,6.5536,13.1072,26.2144"
Logs show no error:
May 03 17:48:28 graf smokeping_prober[943298]: ts=2023-05-04T00:48:28.264Z caller=main.go:202 level=info msg="Starting prober" address=138.199.4.164 interval=1s size_bytes=56
May 03 17:48:28 graf smokeping_prober[943298]: ts=2023-05-04T00:48:28.291Z caller=main.go:220 level=info msg="Listening on" address=:9374
May 03 17:48:28 graf smokeping_prober[943298]: ts=2023-05-04T00:48:28.291Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
Config:
targets:
- hosts:
  - 89.187.177.134 # NYC
  - xxx 30 more hosts # NYC
  interval: 1s
  network: ip
  protocol: icmp
  size: 56
System info:
$ go version
go version go1.15.15 linux/amd64
$ uname -a
Linux graf 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux
See the source ip in the smokeping_prober.yaml file. It may be struggling with the routing because of it. Try commenting it out.
# source: 127.0.1.1 # Source IP address to use. Default: None (automatic selection)
I'm seeing the same issue, and adding a source doesn't seem to matter here. I'm using Docker and the following YAML.
targets:
- hosts:
  - 8.8.8.8
  - 1.1.1.1
  interval: 1s # Duration, Default 1s.
  network: ip4 # One of ip, ip4, ip6. Default: ip (automatic IPv4/IPv6)
  protocol: icmp # One of icmp, udp. Default: icmp (Requires privileged operation)
  size: 56 # Packet data size in bytes. Default 56 (Range: 24 - 65535)
  # source: # Source IP address to use. Default: None (automatic selection)
Prometheus grabbing metrics:
# HELP smokeping_requests_total Number of ping requests sent
# TYPE smokeping_requests_total counter
smokeping_requests_total{host="1.1.1.1",ip="1.1.1.1",source=""} 38
smokeping_requests_total{host="8.8.8.8",ip="8.8.8.8",source=""} 38
# HELP smokeping_response_duplicates_total The number of duplicated response packets.
# TYPE smokeping_response_duplicates_total counter
smokeping_response_duplicates_total{host="1.1.1.1",ip="1.1.1.1",source=""} 0
smokeping_response_duplicates_total{host="8.8.8.8",ip="8.8.8.8",source=""} 0
# HELP smokeping_response_duration_seconds A histogram of latencies for ping responses.
# TYPE smokeping_response_duration_seconds histogram
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="5e-05"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0001"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0002"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0004"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0008"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0016"} 0
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0032"} 4
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0064"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0128"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0256"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.0512"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.1024"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.2048"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.4096"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="0.8192"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="1.6384"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="3.2768"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="6.5536"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="13.1072"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="26.2144"} 37
smokeping_response_duration_seconds_bucket{host="1.1.1.1",ip="1.1.1.1",source="",le="+Inf"} 37
smokeping_response_duration_seconds_sum{host="1.1.1.1",ip="1.1.1.1",source=""} 0.13052124300000004
smokeping_response_duration_seconds_count{host="1.1.1.1",ip="1.1.1.1",source=""} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="5e-05"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0001"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0002"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0004"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0008"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0016"} 0
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0032"} 17
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0064"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0128"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0256"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.0512"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.1024"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.2048"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.4096"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="0.8192"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="1.6384"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="3.2768"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="6.5536"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="13.1072"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="26.2144"} 37
smokeping_response_duration_seconds_bucket{host="8.8.8.8",ip="8.8.8.8",source="",le="+Inf"} 37
smokeping_response_duration_seconds_sum{host="8.8.8.8",ip="8.8.8.8",source=""} 0.11810808400000002
smokeping_response_duration_seconds_count{host="8.8.8.8",ip="8.8.8.8",source=""} 37
# HELP smokeping_response_ttl The last response Time To Live (TTL).
# TYPE smokeping_response_ttl gauge
smokeping_response_ttl{host="1.1.1.1",ip="1.1.1.1",source=""} 56
smokeping_response_ttl{host="8.8.8.8",ip="8.8.8.8",source=""} 117
# HELP smokeping_send_errors_total The number of errors when Pinger attempts to send packets.
# TYPE smokeping_send_errors_total counter
smokeping_send_errors_total{host="1.1.1.1",ip="1.1.1.1",source=""} 0
smokeping_send_errors_total{host="8.8.8.8",ip="8.8.8.8",source=""} 0
Based on this info, I'm not sure what is wrong with the query for the heatmap.
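As a sanity check on the scrape output above, here is a minimal Python sketch that converts the cumulative `le` counts for 1.1.1.1 into per-bucket counts and derives the mean latency from the `_sum` and `_count` series. The names `cumulative` and `per_bucket` are made up for illustration and are not part of smokeping_prober or Prometheus; the numbers are copied from the paste.

```python
# Cumulative histogram counts for host 1.1.1.1, copied from the scrape above
# as (upper bound `le` in seconds, cumulative count). Only the buckets around
# the observed latencies are listed; every bucket below 0.0016 is also 0.
cumulative = [(0.0016, 0), (0.0032, 4), (0.0064, 37), (0.0128, 37)]


def per_bucket(cumulative):
    """Turn cumulative `le` counts into the count that falls inside each bucket."""
    counts = []
    previous = 0
    for upper, total in cumulative:
        counts.append((upper, total - previous))
        previous = total
    return counts


counts = per_bucket(cumulative)

# Mean latency = _sum / _count, converted from seconds to milliseconds.
mean_ms = 0.13052124300000004 / 37 * 1000

for upper, n in counts:
    print(f"le={upper}: {n} responses")
print(f"mean latency: {mean_ms:.2f} ms")
```

For this paste, 4 responses fall at or below 3.2 ms and 33 fall between 3.2 ms and 6.4 ms, with a mean of about 3.5 ms. The exporter is producing a populated histogram, which points at the query or scrape side rather than at the prober.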
Did you try to use the dashboard.json from this repository? Works for me out of the box.
The heatmap query looks like this. I modified it a little, but both the original from the repository and this one work.
sum(rate(smokeping_response_duration_seconds_bucket{host=~"$target"}[1m])) by (le)
The rate window is [1m], so Prometheus needs to scrape smokeping_prober at least every 30s to get results: rate() needs at least two samples inside the window.
@Nachtfalkeaw, yep, that's the one. Neither your query nor the original works for me, at least. Pings are at a 1s interval; I switched to 15s just for fun and, as expected, the result is the same.
It works if it's 2m instead of 1m. Not sure why... what am I missing?
My scrape interval in Prometheus was too long for a [1m] window, as you mentioned. (I'm new to Prometheus.)
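The scrape-interval rule from this exchange can be sketched in a few lines of Python. This is a simplified model, not Prometheus code: it assumes perfectly regular scrapes and ignores jitter, failed scrapes, and staleness handling. The function names are hypothetical.

```python
import math


def min_samples_in_window(window_s: int, scrape_interval_s: int) -> int:
    """Fewest samples a range window is guaranteed to contain when samples
    arrive every scrape_interval_s seconds (worst-case alignment)."""
    return math.floor(window_s / scrape_interval_s)


def rate_has_data(window_s: int, scrape_interval_s: int) -> bool:
    """rate() needs at least two samples in the window to return a value."""
    return min_samples_in_window(window_s, scrape_interval_s) >= 2


# (window, scrape interval) pairs in seconds, mirroring the cases in the thread.
for window_s, interval_s in [(60, 30), (60, 60), (120, 60)]:
    ok = rate_has_data(window_s, interval_s)
    print(f"rate(...[{window_s}s]) with {interval_s}s scrapes: "
          f"{'data' if ok else 'possibly empty'}")
```

Under this model a [1m] window is reliable only with scrapes every 30s or faster, while a [2m] window also tolerates 60s scrapes. That is consistent with the heatmap appearing once the window was widened to 2m.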
I just hit the same problem when trying to start smokeping_prober as a nologin user. Running it as root works.