procfs
Conntrack parsing issue "/proc/net/stat/nf_conntrack" does not exist
I'm running a fully IPv6 Linux host with nftables (the iptables replacement).
I noticed that the node-exporter conntrack collector is failing on all hosts with "conntrack probably not loaded".
I traced this in the source code to https://github.com/prometheus/node_exporter/blob/4d0c1650b5a8f9c653188a18c8d36354365b4720/collector/conntrack_linux.go#L130
A quick strace shows the failure when opening a file to fetch stats from procfs:
openat(AT_FDCWD, "/proc/sys/net/netfilter/nf_conntrack_count", O_RDONLY|O_CLOEXEC) = 8
epoll_ctl(4, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2864231944, u64=140315150824968}}) = 0
fcntl(8, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
fstat(8, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(8, "385\n", 512) = 4
read(8, "", 508) = 0
epoll_ctl(4, EPOLL_CTL_DEL, 8, 0xc000204b2c) = 0
close(8) = 0
openat(AT_FDCWD, "/proc/sys/net/netfilter/nf_conntrack_max", O_RDONLY|O_CLOEXEC) = 8
epoll_ctl(4, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2864231944, u64=140315150824968}}) = 0
fcntl(8, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
fstat(8, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(8, "32768\n", 512) = 6
read(8, "", 506) = 0
epoll_ctl(4, EPOLL_CTL_DEL, 8, 0xc000204b2c) = 0
close(8) = 0
newfstatat(AT_FDCWD, "/proc", {st_mode=S_IFDIR|0555, st_size=0, ...}, 0) = 0
openat(AT_FDCWD, "/proc/net/stat/nf_conntrack", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
write(2, "level=debug ts=2021-08-08T22:01:"..., 126level=debug ts=2021-08-08T22:01:14.597Z caller=conntrack_linux.go:156 collector=conntrack msg="conntrack probably not loaded"
) = 126
write(2, "level=debug ts=2021-08-08T22:01:"..., 174level=debug ts=2021-08-08T22:01:14.597Z caller=collector.go:167 msg="collector returned no data" name=conntrack duration_seconds=0.001267422 err="collector returned no data"
) = 174
It seems that the file /proc/net/stat/nf_conntrack does not exist, and the procfs project tries to open it: https://github.com/prometheus/procfs/blob/5162bec877a860b5ff140b5d13db31ebb0643dd3/net_conntrackstat.go#L43
This is strange, because nftables seems to use the netfilter conntrack module:
root@app1:~# lsmod | grep conn
nf_conntrack_netlink 32768 0
nf_conntrack 86016 3 nf_nat,nft_ct,nf_conntrack_netlink
nf_defrag_ipv6 12288 1 nf_conntrack
nf_defrag_ipv4 12288 1 nf_conntrack
libcrc32c 12288 2 nf_conntrack,nf_nat
root@app:~# cat /proc/sys/net/netfilter/nf_conntrack_count
385
Yet the pseudo-file /proc/net/stat/nf_conntrack is not created.
I'm running Linux 5.4.0-1044-kvm #46-Ubuntu SMP Wed Jul 14 21:36:50 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
https://github.com/prometheus/procfs/blob/5162bec877a860b5ff140b5d13db31ebb0643dd3/net_conntrackstat.go#L83
Nothing here seems to replace the missing counters file: https://www.kernel.org/doc/html/latest/networking/nf_conntrack-sysctl.html
I found this old kernel patch https://patchwork.ozlabs.org/project/netfilter-devel/patch/20120627110147.GA25605@1984/ related to how "/proc/net/stat/nf_conntrack" is produced nowadays. As I understand it, the nfnetlink_conntrack module is responsible for producing these counters, but according to the "lsmod" output above it seems to be required by nftables just as it is by netfilter, so it is loaded as well.
I don't think there is an issue with procfs or node-exporter. If the stat file is missing, the node exporter prints the debug message "conntrack probably not loaded", but the message may be slightly misleading, because some metrics exist and the nf_conntrack module is loaded. If the stat file is present, its counters are returned as well. It's also not clear to me how to turn the stat file on or off; it looks like loading the nf_conntrack module is not enough.
# HELP node_nf_conntrack_entries Number of currently allocated flow entries for connection tracking.
# TYPE node_nf_conntrack_entries gauge
node_nf_conntrack_entries 45
# HELP node_nf_conntrack_entries_limit Maximum size of connection tracking table.
# TYPE node_nf_conntrack_entries_limit gauge
node_nf_conntrack_entries_limit 262144
I discovered a related issue recently (no errors but stats not being exposed) and found this link https://serverfault.com/questions/1080112/statistics-proc-net-stat-nf-conntrack-is-missing-on-linux-server. It seems the proc nf_conntrack file is disabled by default on a lot of newer kernels and the replacement is the userspace conntrack binary.
I would love to see this functionality, but it looks like we would have to execute and parse the output of this application from what I can tell. Any thoughts?
We don't allow executing other binaries in the node-exporter. If there is no kernel interface (or similar) to read from, this would need to be a separate exporter or use the textfile collector.
I was thinking about this more last night and came to the same conclusion. I'll tinker around with the textfile collector and see what comes of it. Thanks for the input!
We could probably get the conntrack stats via https://github.com/ti-mo/conntrack in the node-exporter if someone wants to take a stab at that.
What about the fact that when the stat file is missing, the collector_success{collector="conntrack"} metric is set to 0 (failed)? In my case this of course triggers alerts. Since we don't expect this stat file to be available on nftables hosts, should we avoid failing the whole collector when only https://github.com/prometheus/node_exporter/blob/6a0598e563213ae95edb6f5152ffc8156abd0932/collector/conntrack_linux.go#L131 fails?
You can disable the collector explicitly. But right, we might consider not setting collector_success = 0; that would be an issue for the node-exporter repository, if someone wants to file it. I don't think there is anything to do in procfs, so I'm closing the issue.