gluster-metrics-exporter

Cannot get data from one node

Open · webfrank opened this issue 3 years ago · 1 comment

Hi, I have a 3-node Gluster cluster with gluster-metrics-exporter (0.3.1) running. I would like to get the brick uptime for each node, and I expect each node to export its own brick data, but this is what I get:

ovh3 (first node)

# HELP brick_uptime_seconds Brick Uptime in Seconds
# TYPE brick_uptime_seconds gauge
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh3", path="/mnt/gfs/data1/brick1"} 33081825.0
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh5", path="/mnt/gfs/data2/brick2"} 32977767.0
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh3", path="/mnt/gfs/data2/brick2"} 33081825.0
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh5", path="/mnt/gfs/data1/brick1"} 32977767.0

Here I also have the metrics from the third node (ovh5).

ovh4 (second node)

# HELP brick_uptime_seconds Brick Uptime in Seconds
# TYPE brick_uptime_seconds gauge
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh5", path="/mnt/gfs/data2/brick2"} 32977988.0
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh5", path="/mnt/gfs/data1/brick1"} 32977988.0

Here I only have metrics from ovh5, not the expected ovh4.

ovh5 (third node)

# HELP brick_uptime_seconds Brick Uptime in Seconds
# TYPE brick_uptime_seconds gauge
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh5", path="/mnt/gfs/data2/brick2"} 32978040.0
brick_uptime_seconds{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh5", path="/mnt/gfs/data1/brick1"} 32978040.0

Correct metrics for ovh5

From these metrics I gather that I have no brick_uptime_seconds metrics at all for the bricks on ovh4. Why? What does this depend on?
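For reference, something like this Python sketch could compare what each exporter reports (the hostnames are my nodes; the port is just a placeholder for whatever port the exporter actually listens on in your setup):

# Rough sketch: count brick_uptime_seconds samples per hostname label
# on each node's exporter. Port 9713 is an assumption; adjust as needed.
import re
import urllib.request
from collections import Counter

NODES = ["ovh3", "ovh4", "ovh5"]   # the three peers in this cluster
PORT = 9713                        # placeholder exporter port

label_re = re.compile(r'^brick_uptime_seconds\{.*hostname="([^"]+)"')

for node in NODES:
    url = f"http://{node}:{PORT}/metrics"
    body = urllib.request.urlopen(url, timeout=5).read().decode()
    # Tally how many brick_uptime_seconds samples carry each hostname label
    counts = Counter(
        m.group(1)
        for line in body.splitlines()
        if (m := label_re.match(line))
    )
    print(f"{node}: {dict(counts)}")

With the output above, ovh3 reports samples for ovh3 and ovh5, while ovh4 and ovh5 report samples for ovh5 only.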

webfrank · Jan 26 '22 07:01

Just for completeness:

# HELP brick_health Brick Health
# TYPE brick_health gauge
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh3", path="/mnt/gfs/data1/brick1"} 1.0
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh4", path="/mnt/gfs/data1/brick1"} 1.0
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="0", hostname="ovh5", path="/mnt/gfs/data2/brick2"} 1.0
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh4", path="/mnt/gfs/data2/brick2"} 1.0
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh3", path="/mnt/gfs/data2/brick2"} 1.0
brick_health{volume_type="Distributed-Replicate", volume_state="Started", volume_name="glfs", subvol_index="1", hostname="ovh5", path="/mnt/gfs/data1/brick1"} 1.0

All bricks are up and online, and ovh4 does appear in brick_health.
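A quick way to cross-check which hostnames show up in brick_health but not in brick_uptime_seconds (piping one node's /metrics output into it) would be something like this sketch:

# Rough sketch: read a /metrics dump on stdin and list hostnames that have
# brick_health samples but no brick_uptime_seconds samples.
import re
import sys

hostname_re = re.compile(r'^(\w+)\{.*hostname="([^"]+)"')

seen = {"brick_health": set(), "brick_uptime_seconds": set()}
for line in sys.stdin:
    m = hostname_re.match(line)
    if m and m.group(1) in seen:
        seen[m.group(1)].add(m.group(2))

missing = seen["brick_health"] - seen["brick_uptime_seconds"]
print("hosts with brick_health but no brick_uptime_seconds:", sorted(missing))

For example: curl -s http://ovh3:9713/metrics | python3 check.py (again, the port is whatever your exporter is configured with).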

webfrank · Jan 26 '22 07:01