procfs

Conntrack parsing issue "/proc/net/stat/nf_conntrack" does not exist

Open Turgon37 opened this issue 3 years ago • 7 comments

I'm running a full IPv6 Linux host with nftables (the iptables replacement).

I noticed that the node-exporter conntrack collector is failing on all hosts with "conntrack probably not loaded".

I've traced the source code down to this line: https://github.com/prometheus/node_exporter/blob/4d0c1650b5a8f9c653188a18c8d36354365b4720/collector/conntrack_linux.go#L130

A quick strace shows the failure when it opens a file to fetch stats from procfs:

openat(AT_FDCWD, "/proc/sys/net/netfilter/nf_conntrack_count", O_RDONLY|O_CLOEXEC) = 8
epoll_ctl(4, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2864231944, u64=140315150824968}}) = 0
fcntl(8, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
fstat(8, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(8, "385\n", 512)                   = 4
read(8, "", 508)                        = 0
epoll_ctl(4, EPOLL_CTL_DEL, 8, 0xc000204b2c) = 0
close(8)                                = 0
openat(AT_FDCWD, "/proc/sys/net/netfilter/nf_conntrack_max", O_RDONLY|O_CLOEXEC) = 8
epoll_ctl(4, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2864231944, u64=140315150824968}}) = 0
fcntl(8, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
fstat(8, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(8, "32768\n", 512)                 = 6
read(8, "", 506)                        = 0
epoll_ctl(4, EPOLL_CTL_DEL, 8, 0xc000204b2c) = 0
close(8)                                = 0
newfstatat(AT_FDCWD, "/proc", {st_mode=S_IFDIR|0555, st_size=0, ...}, 0) = 0
openat(AT_FDCWD, "/proc/net/stat/nf_conntrack", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
write(2, "level=debug ts=2021-08-08T22:01:"..., 126level=debug ts=2021-08-08T22:01:14.597Z caller=conntrack_linux.go:156 collector=conntrack msg="conntrack probably not loaded"
) = 126
write(2, "level=debug ts=2021-08-08T22:01:"..., 174level=debug ts=2021-08-08T22:01:14.597Z caller=collector.go:167 msg="collector returned no data" name=conntrack duration_seconds=0.001267422 err="collector returned no data"
) = 174

It seems that the file /proc/net/stat/nf_conntrack does not exist, and the procfs project tries to open it here: https://github.com/prometheus/procfs/blob/5162bec877a860b5ff140b5d13db31ebb0643dd3/net_conntrackstat.go#L43

That's strange, because nftables does seem to use the netfilter conntrack module:

root@app1:~# lsmod | grep conn
nf_conntrack_netlink    32768  0
nf_conntrack           86016  3 nf_nat,nft_ct,nf_conntrack_netlink
nf_defrag_ipv6         12288  1 nf_conntrack
nf_defrag_ipv4         12288  1 nf_conntrack
libcrc32c              12288  2 nf_conntrack,nf_nat
root@app:~# cat /proc/sys/net/netfilter/nf_conntrack_count 
385

But the pseudo-file at /proc/net/stat/nf_conntrack is not produced. I'm running Linux 5.4.0-1044-kvm #46-Ubuntu SMP Wed Jul 14 21:36:50 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux.

https://github.com/prometheus/procfs/blob/5162bec877a860b5ff140b5d13db31ebb0643dd3/net_conntrackstat.go#L83
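
For reference, when the stat file does exist it holds a header row of column names followed by one row of hexadecimal counters per CPU. The following is a minimal sketch of parsing that layout, not the procfs implementation. The function name `parseConntrackStat` and the shortened sample columns are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseConntrackStat parses text laid out like /proc/net/stat/nf_conntrack:
// one header row of column names, then one row of hex counters per CPU.
// It returns one map (column name -> value) per CPU row.
// Hypothetical helper, not part of procfs.
func parseConntrackStat(text string) ([]map[string]uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(text))
	if !sc.Scan() {
		return nil, fmt.Errorf("empty input")
	}
	header := strings.Fields(sc.Text())
	var rows []map[string]uint64
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != len(header) {
			return nil, fmt.Errorf("got %d fields, want %d", len(fields), len(header))
		}
		row := make(map[string]uint64, len(header))
		for i, f := range fields {
			// Counters are printed in hexadecimal without a 0x prefix.
			v, err := strconv.ParseUint(f, 16, 64)
			if err != nil {
				return nil, fmt.Errorf("column %q: %w", header[i], err)
			}
			row[header[i]] = v
		}
		rows = append(rows, row)
	}
	return rows, sc.Err()
}

func main() {
	// Illustrative sample with a shortened column set, not real host data.
	sample := "entries found invalid drop\n" +
		"00000181 0000000a 00000002 00000000\n" +
		"00000181 00000003 00000000 00000001\n"
	rows, err := parseConntrackStat(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for cpu, r := range rows {
		fmt.Printf("cpu%d entries=%d found=%d invalid=%d drop=%d\n",
			cpu, r["entries"], r["found"], r["invalid"], r["drop"])
	}
}
```

Keying each row by the header names rather than by position keeps the sketch tolerant of kernels that add or rename columns.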

Turgon37 avatar Aug 08 '21 22:08 Turgon37

Nothing in the sysctl documentation seems to replace the missing counters file: https://www.kernel.org/doc/html/latest/networking/nf_conntrack-sysctl.html

Turgon37 avatar Aug 08 '21 22:08 Turgon37

I found this old kernel patch, https://patchwork.ozlabs.org/project/netfilter-devel/patch/20120627110147.GA25605@1984/, which is related to how "/proc/net/stat/nf_conntrack" is produced nowadays. As I understand it, the nfnetlink_conntrack module is responsible for producing these counters, but according to my "lsmod" output above, it seems to be required by nftables just as it is by netfilter, so it is loaded as well.

Turgon37 avatar Aug 08 '21 22:08 Turgon37

I don't think there is an issue with procfs or node-exporter. If the stat file is missing, node-exporter prints the debug message "conntrack probably not loaded". That message may be slightly misleading, because some metrics do exist and the nf_conntrack module is loaded. If the stat file is present, its metrics are returned as well. It's also not clear to me how to turn the stat file on or off; it looks like loading the nf_conntrack module is not enough.

# HELP node_nf_conntrack_entries Number of currently allocated flow entries for connection tracking.
# TYPE node_nf_conntrack_entries gauge
node_nf_conntrack_entries 45
# HELP node_nf_conntrack_entries_limit Maximum size of connection tracking table.
# TYPE node_nf_conntrack_entries_limit gauge
node_nf_conntrack_entries_limit 262144

binjip978 avatar Sep 12 '21 07:09 binjip978

I discovered a related issue recently (no errors, but stats not being exposed) and found this link: https://serverfault.com/questions/1080112/statistics-proc-net-stat-nf-conntrack-is-missing-on-linux-server. It seems the proc nf_conntrack file is disabled by default on a lot of newer kernels, and the replacement is the userspace conntrack binary.

I would love to see this functionality, but it looks like we would have to execute and parse the output of this application from what I can tell. Any thoughts?

taintedkernel avatar Mar 16 '22 20:03 taintedkernel

We don't allow executing other binaries in node-exporter. If there is no kernel interface (or similar) for this, it would need to be a separate exporter or use the textfile collector.
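
As a rough illustration of the textfile route, the sketch below turns already-captured statistics text into Prometheus text exposition format. It assumes the input looks like lines of `cpu=N key=value ...` pairs, which is the layout assumed here for `conntrack -S`; verify that against your conntrack version. The function name `renderConntrackStats` and the `conntrack_stat_*_total` metric names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// sample is one counter value for one CPU.
type sample struct {
	cpu   string
	value uint64
}

// renderConntrackStats converts lines such as "cpu=0 found=3 invalid=1"
// (assumed input layout) into Prometheus text exposition format.
// Samples are grouped per metric, as the exposition format expects.
func renderConntrackStats(output string) (string, error) {
	byMetric := map[string][]sample{}
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if !strings.HasPrefix(fields[0], "cpu=") {
			return "", fmt.Errorf("line does not start with cpu=: %q", line)
		}
		cpu := strings.TrimPrefix(fields[0], "cpu=")
		for _, f := range fields[1:] {
			name, raw, ok := strings.Cut(f, "=")
			if !ok {
				return "", fmt.Errorf("unexpected token %q", f)
			}
			v, err := strconv.ParseUint(raw, 10, 64)
			if err != nil {
				return "", fmt.Errorf("counter %q: %w", name, err)
			}
			byMetric[name] = append(byMetric[name], sample{cpu, v})
		}
	}
	// Sort metric names so the output is deterministic.
	names := make([]string, 0, len(byMetric))
	for n := range byMetric {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		fmt.Fprintf(&b, "# TYPE conntrack_stat_%s_total counter\n", n)
		for _, s := range byMetric[n] {
			fmt.Fprintf(&b, "conntrack_stat_%s_total{cpu=%q} %d\n", n, s.cpu, s.value)
		}
	}
	return b.String(), nil
}

func main() {
	// Illustrative input, not captured from a real host.
	text, err := renderConntrackStats("cpu=0 found=3 invalid=1\ncpu=1 found=5 invalid=0\n")
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Print(text)
}
```

A separate job (cron or a systemd timer, for example) would run the binary, feed its output through something like this, and write the result to a `.prom` file in the directory passed to `--collector.textfile.directory`, ideally by writing a temporary file and renaming it so node-exporter never reads a half-written file.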

discordianfish avatar Mar 22 '22 11:03 discordianfish

I was thinking about this more last night and came to the same conclusion. I'll tinker around with the textfile collector and see what comes of it. Thanks for the input!

taintedkernel avatar Mar 22 '22 15:03 taintedkernel

We could probably get the conntrack stats via https://github.com/ti-mo/conntrack in the node-exporter if someone wants to take a stab at that.

discordianfish avatar Mar 23 '22 11:03 discordianfish

What about the fact that when the stat file is missing, the collector_success{collector="conntrack"} metric is set to 0 (failed)?

In my case this of course triggers alerts. Since we don't expect this stat file to be available on nftables hosts, should we avoid failing the whole collector when only https://github.com/prometheus/node_exporter/blob/6a0598e563213ae95edb6f5152ffc8156abd0932/collector/conntrack_linux.go#L131 fails?

primeroz avatar Feb 02 '23 15:02 primeroz

You can disable the collector explicitly. But right, we might consider not setting collector_success = 0. That would be an issue for node-exporter, though, if someone wants to file it. I don't think there is anything to do in procfs, so I'm closing the issue.

discordianfish avatar Mar 07 '23 12:03 discordianfish