node_exporter
Parallelize filesystem_linux stat calls
Parallelize Linux Statfs() calls and stop the collector from hanging when NFS mounts hang. Closes issue #1760.
A modification of work submitted here: https://github.com/prometheus/node_exporter/pull/1772
This change adds the ability to process multiple stat calls in parallel. Concurrency is bounded by the new flag collector.filesystem.stat-workers (default 4). If 0 is provided as the flag value, the number of workers equals the number of mount points.
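For readers who want a picture of the bounded-worker behaviour described above, here is a minimal sketch in Go. It is not the code from this PR: `statAll`, `statResult`, and the injected `stat` function are hypothetical names chosen for illustration, and the stat call is passed in as a parameter so the example runs without touching real mounts.

```go
package main

import (
	"fmt"
	"sync"
)

// statResult pairs a mount point with the outcome of its stat call.
// (Hypothetical type, for illustration only.)
type statResult struct {
	mountPoint string
	size       uint64
	err        error
}

// statAll runs stat over mountPoints using at most `workers` goroutines.
// workers == 0 means one worker per mount point, mirroring the flag
// semantics described in the PR text. Clamping to len(mountPoints) is an
// assumption of this sketch.
func statAll(mountPoints []string, workers int, stat func(string) (uint64, error)) []statResult {
	if workers <= 0 || workers > len(mountPoints) {
		workers = len(mountPoints)
	}
	jobs := make(chan int)
	results := make([]statResult, len(mountPoints))
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker writes only to the index it received,
			// so no extra locking is needed on results.
			for i := range jobs {
				size, err := stat(mountPoints[i])
				results[i] = statResult{mountPoints[i], size, err}
			}
		}()
	}
	for i := range mountPoints {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	// A stand-in for the real stat call; it just reports the path length.
	fake := func(mp string) (uint64, error) { return uint64(len(mp)), nil }
	for _, r := range statAll([]string{"/", "/home", "/mnt/nfs"}, 2, fake) {
		fmt.Println(r.mountPoint, r.size, r.err)
	}
}
```

Results are stored by index, so the output order matches the input order regardless of which worker finishes first.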
This change also removes unbounded hang times when NFS mounts hang during stat calls. Previously, node exporter would hang on the first unresponsive NFS stat call until the NFS mount recovered. With this change, the filesystem_linux collector returns after the specified timeout when an NFS stat call is unresponsive. The easiest way to test this is to simulate hanging NFS stat calls by blocking the NFS port (usually 2049) with iptables:
sudo iptables -A INPUT -p tcp --destination-port 2049 -j DROP
Signed-off-by: Samuel Noguchi [email protected]
Would love to see this landed, it's quite a pain to have spotty metrics due to an unstable NFS connection!
Ping, this needs a rebase.
@samuelnoguchi please rebase, this would be fantastic for systems with many mounts (multiple thousands).
Looks like this can be closed because #1772 was merged.