kube-prometheus

Disk Space Usage's available is wrong in Nodes Grafana dashboard

Open shubhamc183 opened this issue 3 years ago • 6 comments

I think I have hit a bug in Grafana > Dashboard > Nodes.json > Disk Space Usage (panel) > available.

It is showing me the wrong data. I think adding fstype!="tmpfs" to the query could help.

sum(
  max by (device) (
    node_filesystem_avail_bytes{job="node-exporter", instance="$instance", fstype!=""}
  )
)

I have a 20GB worker node on which almost 18GB of disk is used, yet Grafana still shows 17.5GB available.

Because of the last tmpfs entry, the query picks it up and shows more available space than the node actually has.
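A minimal sketch of the change suggested above, assuming the inflated value comes from tmpfs mounts: they typically share one device label, so `max by (device)` keeps the largest of them and `sum` adds it to the real disk. Keeping the original `fstype!=""` matcher and adding a second one for tmpfs should exclude them. This has not been checked against the reporter's node.

```promql
# Same panel query, with tmpfs filesystems excluded as well.
sum(
  max by (device) (
    node_filesystem_avail_bytes{job="node-exporter", instance="$instance", fstype!="", fstype!="tmpfs"}
  )
)
```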

shubhamc183 avatar May 12 '21 08:05 shubhamc183

The fstype selector is configurable. The setting is available at $.values.nodeExporter.mixin._config.fsSelector: 'fstype!=""'. I wonder whether excluding tmpfs in the mixin could have some unpredicted consequences, though 🤔

paulfantom avatar May 12 '21 12:05 paulfantom

I am not sure either, @paulfantom. Can we just add an fsSelector for the Nodes dashboard?

shubhamc183 avatar May 12 '21 16:05 shubhamc183

The dashboard comes from the mixins developed in the prometheus/node_exporter repository, and it is already possible to configure the dashboards with fsSelector 🙂.

https://github.com/prometheus/node_exporter/blob/f04d5569a2f611909526f497fc610ea0bef822f7/docs/node-mixin/dashboards/node.libsonnet#L144-L146

An example jsonnet file:

(import 'kube-prometheus/main.libsonnet') +
{
  values+:: {
    nodeExporter+: {
      mixin+: {
        _config+: {
          // Replace the default 'fstype!=""' selector so tmpfs filesystems are excluded.
          fsSelector: 'fstype!="tmpfs"',
        },
      },
    },
  },
}

ArthurSens avatar May 26 '21 22:05 ArthurSens

Not sure if it would be better to open a separate issue for this, but summing up all devices does not seem that helpful anyway. If I have different partitions for /var/lib/docker and /, for example, a graph of the sum seems rather useless.

What would be the correct way to change the dashboard to support multiple devices?
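One possible direction, offered only as a sketch and not as the mixin's own approach: drop the outer `sum` so the panel draws one series per device. Including `mountpoint` in the grouping is an assumption made here so that partitions such as / and /var/lib/docker can be told apart in the legend.

```promql
# One series per device and mountpoint instead of a single summed value.
max by (device, mountpoint) (
  node_filesystem_avail_bytes{job="node-exporter", instance="$instance", fstype!=""}
)
```

In Grafana, a legend format such as `{{mountpoint}}` would then label each series.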

spielkind avatar Jul 21 '21 11:07 spielkind

I think I'm also seeing this. I have about a 5% difference between the value reported by df -B1 and node_filesystem_free_bytes.

pschichtel avatar Jan 17 '23 13:01 pschichtel

By default, Linux reserves 5% of disk space (on ext filesystems) for root only. This lets the system keep running when the 95% available to regular users is completely used. It also has to do with disk performance, as fragmentation increases dramatically above 95% disk usage. There is plenty to find on that subject if you search for it.

The df command takes this into account and reports the available size for regular (non-root) users. Prometheus likewise reports node_filesystem_avail_bytes for non-root users, but it reports node_filesystem_size_bytes for all users.
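To see the reservation directly, the root-only share can be estimated as the gap between node_filesystem_free_bytes (free space including reserved blocks) and node_filesystem_avail_bytes (free space for non-root users), relative to the filesystem size. This is a rough sketch that reuses the label matchers from the dashboard query above. On a filesystem with the default reservation the result would be expected to sit near 0.05.

```promql
# Fraction of each filesystem that is free but reserved for root.
(
  node_filesystem_free_bytes{job="node-exporter", instance="$instance", fstype!=""}
  - node_filesystem_avail_bytes{job="node-exporter", instance="$instance", fstype!=""}
)
/ node_filesystem_size_bytes{job="node-exporter", instance="$instance", fstype!=""}
```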

vwolf010 avatar Jan 20 '23 10:01 vwolf010