Add neighbor stats

Open Cellophan opened this issue 8 years ago • 9 comments

Need: node_exporter doesn't provide statistics about neighbors (in the ip neighbor sense): entry counts, table overflows, garbage collections, limits, ...

More info: The neighbors can be listed via ip neigh show. The raw data is in /proc/net/stat/arp_cache and /proc/net/stat/ndisc_cache (for the source of that data, see "struct neigh_table").
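As a rough illustration of reading those files, here is a minimal sketch. It assumes the usual layout of /proc/net/stat/arp_cache and /proc/net/stat/ndisc_cache: a header line with column names, then one row of hexadecimal values per CPU, where "entries" is table-wide and repeated on every row while the other columns are per-CPU counters. The function name and the metric prefix are made up for this example, and very large counters may lose precision in awk.

```shell
# Sketch: print "prefix_column value" lines from a neighbor stats file.
# $1: stats file, $2: metric prefix (hypothetical naming scheme)
neigh_stat_to_prom() {
    awk -v prefix="$2" '
        # Convert a hexadecimal string to a number (portable awk has no strtonum).
        function hex(s,    i, n) {
            n = 0
            s = tolower(s)
            for (i = 1; i <= length(s); i++)
                n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
            return n
        }
        # First line: remember the column names.
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; cols = NF; next }
        {
            for (i = 1; i <= cols; i++) {
                if (name[i] == "entries")
                    total[i] = hex($i)      # table-wide value, same on every row
                else
                    total[i] += hex($i)     # per-CPU counter, summed over rows
            }
        }
        END { for (i = 1; i <= cols; i++) printf "%s_%s %.0f\n", prefix, name[i], total[i] }
    ' "$1"
}
```

For example, neigh_stat_to_prom /proc/net/stat/arp_cache arp would print one line per column, and the same function could be pointed at /proc/net/stat/ndisc_cache for IPv6.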

Use case: I'm working with 5 hosts and more than 120 containers, and I discovered in a kernel log file that my ARP table has overflowed for the second time. The direct impact is packet loss, which makes this a root cause that is hard to find.

Cellophan avatar Mar 24 '17 16:03 Cellophan

I've implemented https://github.com/prometheus/node_exporter/pull/540 which sounds like it should meet the requirements of this request. It collects entry counts on a per-device basis from /proc/net/arp.
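To make the per-device counting concrete, here is a small sketch of the idea, not the code from the pull request. It assumes the usual /proc/net/arp layout (one header line, then one row per entry with the device name in the sixth column). The function name and the metric name arp_entries are placeholders for this example.

```shell
# Sketch: count ARP entries per device from a /proc/net/arp style file.
# $1: path to the file (normally /proc/net/arp)
arp_entries_per_device() {
    awk '
        NR > 1 && NF >= 6 { count[$6]++ }   # skip the header, key on the Device column
        END {
            for (dev in count)
                printf "arp_entries{device=\"%s\"} %d\n", dev, count[dev]
        }
    ' "$1" | LC_ALL=C sort                  # stable output order
}
```

Calling arp_entries_per_device /proc/net/arp prints one line per interface that currently has entries.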

skottler avatar Apr 02 '17 18:04 skottler

@skottler Thanks!

Last week I wrote a shell script to feed the data to the text collector:

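#!/bin/bash

set -eu

# Directory read by the text collector; override by exporting DIR.
: "${DIR:=/path/to/text-collector}"

# Neighbor cache statistics from lnstat, flattened to "name value" lines.
# The sed step collapses doubled braces in the lnstat output before jq parses it.
LNSTAT="${DIR}/lnstat.prom"
lnstat -c 1 --json \
        | sed -e 's|{{|{|' -e 's|}}|}|' \
        | jq -r 'to_entries | .[] | "lnstat_\(.key)_\(.value | to_entries| .[] | "\(.key) \(.value)")"' \
        > "${LNSTAT}.$$"
# Write to a temporary file first, then rename, so a scrape never sees a partial file.
mv "${LNSTAT}.$$" "${LNSTAT}"

# Neighbor table sysctls (gc_thresh1..3 and friends), one metric per file.
SYS="${DIR}/sys.prom"
for f in /proc/sys/net/ipv4/neigh/default/*; do
        echo "sysctl_ipv4_neigh_default_$(basename "$f") $(< "$f")"
done > "${SYS}.$$"
mv "${SYS}.$$" "${SYS}"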

In this script I also export some data from /proc/sys/net/ipv4/neigh/default, because I realized that the gc_thresh{1..3} values are needed to understand the behavior of the metrics collected from /proc/net/stat/arp_cache.
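To show why the thresholds matter when reading the entry count, here is a minimal sketch. It relies on the documented meaning of the sysctls: below gc_thresh1 the garbage collector leaves the table alone, above gc_thresh2 it purges more aggressively, and gc_thresh3 is the hard limit on entries. The function name and the state labels are invented for this example.

```shell
# Sketch: classify a neighbor table size against the three GC thresholds.
# $1: current entries, $2: gc_thresh1, $3: gc_thresh2, $4: gc_thresh3
neigh_pressure() {
    if [ "$1" -ge "$4" ]; then
        echo "full"               # at or above the hard limit, new entries are refused
    elif [ "$1" -ge "$3" ]; then
        echo "above_soft_limit"   # garbage collection becomes aggressive
    elif [ "$1" -ge "$2" ]; then
        echo "gc_active"          # normal garbage collection may purge entries
    else
        echo "no_gc"              # too few entries for garbage collection to run
    fi
}
```

With example thresholds of 128, 512 and 1024, a table holding 600 entries would be reported as above_soft_limit. In practice the thresholds would be read from the gc_thresh1, gc_thresh2 and gc_thresh3 files under /proc/sys/net/ipv4/neigh/default.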

So my question is: is collecting the gc_* values part of this issue or not?

Cellophan avatar Apr 04 '17 08:04 Cellophan

After re-reading: you wrote /proc/net/arp. This file is more readable than /proc/net/stat/arp_cache, but I don't know what its data represents.

I'm dealing with messages from the kernel telling me that my ARP table has been full. I have increased /proc/sys/net/ipv4/neigh/default/gc_thresh3 and the problem has not come back so far, so I assume I solved it. But since gc_thresh3 was at 2048, I don't understand why /proc/net/arp has only a few entries. At least lnstat -c 1 reports a lot more.

It looks like /proc/net/arp only covers my default environment, while /proc/net/stat/arp_cache aggregates the caches of all the containers. Thus I think the second file is the better source.

Cellophan avatar Apr 04 '17 08:04 Cellophan

IPv6 neighbor tables and garbage collection thresholds should be considered as well.

Also the current arp collector does not expose the state (reachable|stale|failed|incomplete) as a label.
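As an illustration of what a state label could look like, here is a rough sketch that counts entries per device and state from the output of ip neigh show. It assumes each line contains "dev <name>" and ends with the state keyword. The function name and the metric name neigh_entries are placeholders, and this is not how the arp collector itself is implemented.

```shell
# Sketch: count neighbor entries per device and state.
# Reads "ip neigh show" style lines on standard input.
neigh_states() {
    awk '
        NF == 0 { next }                      # ignore blank lines
        {
            dev = "unknown"
            for (i = 1; i < NF; i++)
                if ($i == "dev") dev = $(i + 1)
            count[dev "," tolower($NF)]++     # the state is the last field
        }
        END {
            for (key in count) {
                split(key, part, ",")
                printf "neigh_entries{device=\"%s\",state=\"%s\"} %d\n", part[1], part[2], count[key]
            }
        }
    ' | LC_ALL=C sort                         # stable output order
}
```

It could be fed with ip -4 neigh show | neigh_states for ARP or ip -6 neigh show | neigh_states for IPv6 neighbors.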

mweinelt avatar Nov 19 '17 17:11 mweinelt

@skottler Since you added the arp collector, what do you think about adding the state as label?

In general, contributions are welcome to add these things.

discordianfish avatar Aug 18 '18 09:08 discordianfish

@discordianfish I'm happy to add the label, it'd be a nice change to the ARP collector. In a previous pull request we'd discussed potentially moving the parsing code for /proc/net/arp into the procfs library, and I submitted an initial version of that in https://github.com/prometheus/procfs/pull/105. Once that lands I plan to switch node_exporter over to it and then follow up by expanding it with better state support.

How does that sound?

skottler avatar Aug 18 '18 19:08 skottler

@skottler That sounds perfect, thanks!

discordianfish avatar Aug 19 '18 09:08 discordianfish

IPv6 neighbor tables and garbage collection thresholds should be considered as well.

Also the current arp collector does not expose the state (reachable|stale|failed|incomplete) as a label.

Any update on adding these labels to arp collection? :)

naeimehmhm avatar Sep 06 '20 22:09 naeimehmhm

May I pick this up for adding state labels?

EDIT: Made a small PR. Maybe we can also talk about an NDP collector; I'm willing to implement it if the community wants one.

eugercek avatar Nov 23 '24 00:11 eugercek