node_exporter

added md disks in down state

Open Finomosec opened this issue 1 year ago • 7 comments

Added missing mdadm stats:

  • node_md_disks # added {state="down"}
  • node_md_sync_time_remaining (seconds)
  • node_md_blocks_synced_speed
  • node_md_blocks_synced_pct

Notes:

  • One drive was missing from the metrics because it was in state="down" (recovering), a state that was not reported in the output.
  • Using node_md_blocks_synced / node_md_blocks as the progress percentage produced wrong results on my system, because the total block count differs from the number of blocks to be synced. This may be due to the RAID level in use (raid5). Sample /proc/mdstat output:
md0 : active raid5 sdf1[4] sde1[1] sdc1[2] sdb1[0]
      14650718208 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [===================>.]  recovery = 99.9% (4882207424/4883572736) finish=7.8min speed=2908K/sec
      bitmap: 2/37 pages [8KB], 65536KB chunk
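
To illustrate the mismatch with the numbers from the output above: the array reports 14650718208 blocks in total, while the recovery line counts against 4883572736 blocks, which is exactly one third of the total in this sample. The following is only a minimal sketch, not the node_exporter or procfs implementation; the function name `syncProgress` and the regular expression are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// recoveryRE captures the "(synced/to-be-synced)" pair from an mdstat
// progress line. The pattern is an illustrative assumption, not the one
// used by prometheus/procfs.
var recoveryRE = regexp.MustCompile(`\((\d+)/(\d+)\)`)

// syncProgress (hypothetical helper) returns the synced and to-be-synced
// block counts found in a progress line.
func syncProgress(line string) (synced, toSync uint64, err error) {
	m := recoveryRE.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, fmt.Errorf("no progress counters in %q", line)
	}
	if synced, err = strconv.ParseUint(m[1], 10, 64); err != nil {
		return 0, 0, err
	}
	if toSync, err = strconv.ParseUint(m[2], 10, 64); err != nil {
		return 0, 0, err
	}
	return synced, toSync, nil
}

func main() {
	// Total array size, taken from the "blocks" line of the sample.
	const totalBlocks = 14650718208
	line := "[===================>.]  recovery = 99.9% (4882207424/4883572736) finish=7.8min speed=2908K/sec"

	synced, toSync, err := syncProgress(line)
	if err != nil {
		panic(err)
	}
	// Dividing by the total block count understates the progress.
	fmt.Printf("synced / total blocks: %.2f%%\n", 100*float64(synced)/float64(totalBlocks))
	// Dividing by the to-be-synced count matches what mdstat reports.
	fmt.Printf("synced / to-be-synced: %.2f%%\n", 100*float64(synced)/float64(toSync))
}
```

With the sample values, the first ratio comes out at roughly a third while the second is close to the 99.9% that mdstat shows.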

Finomosec avatar Apr 30 '24 00:04 Finomosec

@SuperQ I'm done for now. Feel free to merge it at any time.

Finomosec avatar May 03 '24 15:05 Finomosec

Maybe instead of exposing the sync percent, we should expose the "TODO" blocks value. This way the completion ratio can be correctly calculated as node_md_blocks_synced / node_md_blocks_synced_todo.

SuperQ avatar May 07 '24 21:05 SuperQ

> Maybe instead of exposing the sync percent, we should expose the "TODO" blocks value. This way the completion ratio can be correctly calculated as node_md_blocks_synced / node_md_blocks_synced_todo.

That was my first idea, too, but the data-source (https://github.com/prometheus/procfs/blob/master/mdstat.go) does not (yet) capture/expose this value.

Also, node_md_blocks_synced_todo is not a good name: "todo" sounds like "remaining", which is not correct. Maybe to_be_synced would suffice.

But hey! We could calculate it using blocks_synced and the percentage. What do you think, should we do this?

... but it might be imprecise, especially for low percentage values, and it might yield slightly different results over time, which would be kind of awkward.

So maybe better not after all.
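
A rough sketch of why the back-calculation is imprecise: mdstat prints the percentage with one decimal place (the sample above shows 99.9% where the exact ratio is about 99.97%, which suggests the value is truncated). The helper name `estimateToSync` and the early-recovery block counts below are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// estimateToSync (hypothetical helper) back-computes the to-be-synced total
// from the synced block count and the percentage displayed by mdstat.
func estimateToSync(synced uint64, displayedPct float64) float64 {
	return float64(synced) / (displayedPct / 100)
}

func main() {
	// True to-be-synced count from the sample mdstat output.
	const actual = 4883572736.0

	cases := []struct {
		label        string
		synced       uint64
		displayedPct float64
	}{
		// Values taken from the sample output.
		{"late (sample)", 4882207424, 99.9},
		// Hypothetical early-recovery values: both would display as 0.1%
		// if the percentage is truncated to one decimal place.
		{"early, just past 0.1%", 4890000, 0.1},
		{"early, just below 0.2%", 9760000, 0.1},
	}

	for _, c := range cases {
		est := estimateToSync(c.synced, c.displayedPct)
		fmt.Printf("%-24s estimate=%.0f  estimate/actual=%.3f\n", c.label, est, est/actual)
	}
}
```

Late in the recovery the estimate is close to the real value, but at low percentages the same displayed value covers a wide range of synced counts, so the estimate can be off by almost a factor of two and would drift as the sync proceeds.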

I added a request to add it: https://github.com/prometheus/procfs/issues/636

Finomosec avatar May 09 '24 09:05 Finomosec

Yeah, I agree. Let's add the TODO blocks to procfs.

discordianfish avatar May 13 '24 11:05 discordianfish

Released updated procfs: https://github.com/prometheus/procfs/releases/tag/v0.15.0

SuperQ avatar May 14 '24 09:05 SuperQ