
Parallelize filesystem_linux stat calls

Open samuelnoguchi opened this issue 4 years ago • 2 comments

Parallelize Linux Statfs() calls and stop the collector from hanging indefinitely when NFS mounts hang. Closes issue #1760.

A modification of work submitted here: https://github.com/prometheus/node_exporter/pull/1772

This change adds the ability to process multiple stat calls in parallel. Concurrency is limited by the new flag collector.filesystem.stat-workers (default 4). If 0 is provided as the flag value, the number of workers equals the number of mount points.
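
For readers who want a picture of the worker-limited approach, here is a minimal sketch of a bounded worker pool. It is not the code from this PR: statAll, statResult, and the statFn callback are hypothetical names chosen for illustration, and treating negative values like 0 is an assumption. Only the "0 means one worker per mount point" rule comes from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// statResult is a hypothetical result type used only in this sketch.
type statResult struct {
	mountPoint string
	err        error
}

// statAll calls statFn once per mount point using at most `workers`
// goroutines. workers == 0 means one worker per mount point, as the
// flag description states. (Assumption: negative values behave like 0.)
func statAll(mountPoints []string, workers int, statFn func(string) error) []statResult {
	if workers <= 0 || workers > len(mountPoints) {
		workers = len(mountPoints)
	}

	jobs := make(chan int)
	results := make([]statResult, len(mountPoints))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so writes to
			// results[i] never race with each other.
			for i := range jobs {
				results[i] = statResult{
					mountPoint: mountPoints[i],
					err:        statFn(mountPoints[i]),
				}
			}
		}()
	}

	for i := range mountPoints {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	mounts := []string{"/", "/home", "/mnt/nfs"}
	// A stub stands in for the real stat call so the sketch runs anywhere.
	results := statAll(mounts, 4, func(path string) error { return nil })
	for _, r := range results {
		fmt.Println(r.mountPoint, r.err)
	}
}
```

Results are stored by index, so the output order matches the input order regardless of which worker finishes first.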

This change also removes unbounded hang times when NFS mounts hang during stat calls. Previously, node exporter would hang on the first unresponsive NFS stat call until the NFS mount recovered. With this change, the filesystem_linux collector returns after the specified timeout when an NFS stat call is unresponsive. This is most easily tested by using iptables to block the NFS port (usually 2049), which simulates hanging NFS stat calls:

sudo iptables -A INPUT -p tcp --destination-port 2049 -j DROP

Signed-off-by: Samuel Noguchi [email protected]

samuelnoguchi avatar Aug 02 '21 21:08 samuelnoguchi

Would love to see this landed, it's quite a pain to have spotty metrics due to an unstable NFS connection!

gerardba avatar Sep 30 '21 21:09 gerardba

Ping, this needs a rebase.

SuperQ avatar Oct 27 '21 12:10 SuperQ

@samuelnoguchi please rebase, this would be fantastic for systems with many mounts (multiple thousands)

robbat2 avatar Sep 16 '23 16:09 robbat2

Looks like this can be closed because #1772 was merged.

nayfield avatar Sep 18 '23 13:09 nayfield