
ceph_df: Fix Ceph Pool usage calculation

Open jkirk opened this issue 2 years ago • 7 comments

General information

The Ceph Pool usage calculation is incorrect.

Bug report

Let's take the following ceph df detail output:

  % sudo ceph df detail
  --- RAW STORAGE ---
  CLASS  SIZE    AVAIL    USED    RAW USED  %RAW USED
  hdd    20 TiB  5.2 TiB  15 TiB    15 TiB      73.94
  TOTAL  20 TiB  5.2 TiB  15 TiB    15 TiB      73.94

  --- POOLS ---
  POOL                   ID  PGS  STORED   (DATA)   (OMAP)   OBJECTS  USED    (DATA)  (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
  lava                    1  512  4.8 TiB  4.8 TiB  6.1 MiB    1.27M  14 TiB  14 TiB  18 MiB  86.58    763 GiB  N/A            N/A            N/A         0 B          0 B
  device_health_metrics   2    1  8.6 MiB      0 B  8.6 MiB       12  26 MiB     0 B  26 MiB      0    763 GiB  N/A            N/A            N/A         0 B          0 B

The ceph pool lava has a %USED (percent_used if used with the '--format json' option) value of 86.58.

The corresponding checkmk agent output with the mk_ceph plugin then looks like this:

  <<<ceph_df_json:sep(0)>>>

  {"version":"ceph version 15.2.16 (a6b69e817d6c9e6f02d0a7ac3043ba9cdbda1bdf) octopus (stable)"}
  {"stats":{"total_bytes":21990790324224,"total_avail_bytes":5730461941760,"total_used_bytes":16250664706048,"total_used_raw_bytes":16260328382464,"total_used_raw_ratio":0.73941540718078613,"num_osds":9,"num_per_pool_osds":9,"num_per_pool_omap_osds":9},"stats_by_class":{"hdd":{"total_bytes":21990790324224,"total_avail_bytes":5730461941760,"total_used_bytes":16250664706048,"total_used_raw_bytes":16260328382464,"total_used_raw_ratio":0.73941540718078613}},"pools":[{"name":"lava","id":1,"stats":{"stored":5283427172086,"stored_data":5283420569600,"stored_omap":6602486,"objects":1269198,"kb_used":15491675024,"bytes_used":15863475223778,"data_bytes_used":15863455416320,"omap_bytes_used":19807458,"percent_used":0.86584043502807617,"max_avail":819333627904,"quota_objects":0,"quota_bytes":0,"dirty":0,"rd":4304064608,"rd_bytes":1073575795907584,"wr":4725498063,"wr_bytes":136168724998144,"compress_bytes_used":0,"compress_under_bytes":0,"stored_raw":15850282156032,"avail_raw":2458000764220}},{"name":"device_health_metrics","id":2,"stats":{"stored":9023499,"stored_data":0,"stored_omap":9023499,"objects":12,"kb_used":26437,"bytes_used":27070497,"data_bytes_used":0,"omap_bytes_used":27070497,"percent_used":1.1013095900125336e-05,"max_avail":819333627904,"quota_objects":0,"quota_bytes":0,"dirty":0,"rd":3501,"rd_bytes":10402816,"wr":4752,"wr_bytes":12901376,"compress_bytes_used":0,"compress_under_bytes":0,"stored_raw":27070496,"avail_raw":2458000764220}}]}

This then gets parsed like this:

  fs_used=15128589.077364;15069285.192661;15489625.853762;0;15909966.514864 fs_size=15909966.514864;;;; fs_used_percent=95.088755;;;; growth=274.261788;;;; trend=3667.14612;;;0;662915.271453

As one can see, fs_used_percent has a value of 95.088755, which is clearly not equal to the %USED (percent_used) value of 86.58 shown above. This is wrong.
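For reference (my own arithmetic, not taken from the check source): the 95.088755 matches bytes_used / (bytes_used + max_avail), i.e. the raw allocated bytes divided by themselves plus the logical MAX AVAIL estimate:

  # Values taken from the "lava" pool in the ceph_df_json output above.
  bytes_used = 15863475223778  # raw bytes allocated for the pool (incl. replication)
  max_avail = 819333627904     # notional amount of data that can still be written

  # Mixing raw usage with the logical MAX AVAIL estimate reproduces the bogus value:
  print(bytes_used / (bytes_used + max_avail) * 100)  # -> 95.0887..., matches fs_used_percent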

Proposed changes

We now have two options to calculate the actual pool usage:

  • bytes_used / (avail_raw + bytes_used) -> 0.86584 (= 86.58%)
  • stored / (stored + max_avail) -> 0.86574 (= 86.57%)

with:

  • 'avail_raw' is "the amount of free space available in the cluster".

  • 'bytes_used' is "the space allocated for a pool over all OSDs. This includes replication, allocation granularity, and erasure-coding overhead".

  • 'max_avail' is "an estimate of the notional amount of data that can be written to this pool".

    also:

    The MAX AVAIL value is a complicated function of the replication or erasure code used, the CRUSH rule that maps storage to devices, the utilization of those devices, and the configured mon_osd_full_ratio.

  • 'stored' is the "actual amount of data user/Ceph has stored in a pool".

See:

  • https://docs.ceph.com/en/octopus/rados/operations/monitoring/#checking-a-cluster-s-usage-stats
  • https://docs.ceph.com/en/latest/rados/operations/monitoring/#checking-a-cluster-s-usage-stats

'avail_raw' and 'bytes_used' seem to be more accurate, so I decided to go with these values to calculate the Ceph Pool usage.
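For illustration, a minimal sketch of that calculation against the parsed pool stats (the function name is mine, not the actual check plugin code):

  # Minimal sketch; "stats" is the "stats" object of one pool from the
  # ceph_df_json agent section shown above.
  def pool_used_ratio(stats):
      return stats["bytes_used"] / (stats["avail_raw"] + stats["bytes_used"])

  lava = {"bytes_used": 15863475223778, "avail_raw": 2458000764220}
  print(round(pool_used_ratio(lava) * 100, 2))  # -> 86.58, matches Ceph's %USED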

This patch applies to the 2.0 branch of checkmk. A patch against the master and 2.1 branches would be simple; I can provide one if this one gets accepted.

jkirk avatar Jan 31 '23 17:01 jkirk

Our own Ceph extension does it a little differently, as sometimes not all needed values are available for a pool.

https://github.com/HeinleinSupport/check_mk_extensions/blob/cmk2.1/ceph/lib/check_mk/base/plugins/agent_based/cephdf.py#L80

gurubert avatar May 05 '23 09:05 gurubert

Cool, just learned about your(!) Ceph statistics plugin[^1] today via: https://checkmk.com/blog/proxmox-monitoring. 💪🏾 👍🏾

Do I read your code correctly that if the Ceph pool is full (aka max_avail == 0), you (have to) opt for the percent_used calculation (because max_avail is missing)?

And further, can it happen that no stored section exists? I assume this happens if nothing is stored in the Ceph pool? Shouldn't percent_used then be 0? Oh, and then size_mb = used_mb / stats['percent_used'] would also lead to a ZeroDivisionError, no? This might be a problem... 🤷🏾
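Just to illustrate what I mean, a hypothetical guard (for illustration only, not your actual cephdf.py code) that would avoid the division by zero:

  # Hypothetical guard, for illustration only (not the actual plugin code):
  # fall back to percent_used only when it is non-zero, otherwise assume an
  # empty pool so we never divide by zero.
  def pool_usage_percent(stats):
      if "stored" in stats and stats.get("max_avail", 0) > 0:
          return 100.0 * stats["stored"] / (stats["stored"] + stats["max_avail"])
      if stats.get("percent_used", 0) > 0:
          return 100.0 * stats["percent_used"]
      return 0.0  # no stored section / percent_used == 0: treat as empty pool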

I've never used Checkmk Exchange packages. I'll try to find out how to do that and give some feedback. Thanks for your comment + plugin(s)!

[^1]: The "Website" link in the Chechmk Exchange is broken (it leads to https://github.com/HeinleinSupport/check_mk/tree/master/ceph -> 404 - page not found. You might want to fix it to https://github.com/HeinleinSupport/check_mk_extensions/tree/cmk2.1/ceph or so).

jkirk avatar May 10 '23 19:05 jkirk

Just noticed a typo in my commit message and comment. It should read:

  • bytes_used / (avail_raw + bytes_used)

instead of:

  • avail_raw / (avail_raw + bytes_used)

Just edited my comment above and force-pushed my fixed commit.

jkirk avatar May 10 '23 20:05 jkirk

Hi, thank you for uploading a PR. I'm not sure if your PR is ready after the discussion with @gurubert or if you want to adapt further things. Please let me know.

si-23 avatar Sep 27 '23 10:09 si-23

@si-23 Thanks for your response.

From my POV, this PR is ready as is because it fixes the Ceph pool usage calculation.

The check can be improved or even replaced by @gurubert's work, but that is a task for another day.

Meanwhile, we have moved to checkmk 2.2.0, where the Ceph pool usage calculation also seems to be broken. I'll try to come up with a fix in a separate PR.

jkirk avatar Oct 13 '23 13:10 jkirk

Hi everyone, this PR would also solve my issue and hasn't been updated for months. Could it be reviewed and merged soon? I'm happy to assist if any additional work is needed. Thanks!

lukas-fichtner avatar May 27 '24 13:05 lukas-fichtner

Although v2.0 is quite outdated, it would be nice to have this patch included.

And while doing so, it could also be ported to checkmk 2.2+.

I'd also be happy to assist in any way!

jkirk avatar May 31 '24 16:05 jkirk

Hi @jkirk, I am closing this PR. Meanwhile, we have mainlined @gurubert's Ceph integration into Checkmk to replace ours. Our resources are quite limited and I'd prefer not to invest any time in a plugin that is now deprecated -- I hope you understand. Nevertheless, thank you for your time and effort!

mo-ki avatar Sep 10 '24 09:09 mo-ki