integrations-core icon indicating copy to clipboard operation
integrations-core copied to clipboard

cockroachdb: support all v23.2 metrics

Open sudomateo opened this issue 1 year ago • 2 comments

What does this PR do?

CockroachDB exposes its metrics via a Prometheus endpoint on /_status/vars. In CockroachDB v23.2 there are ~1500 metrics exposed via this endpoint.

After deduplication and removing histogram metrics (i.e., .bucket, .sum), the cockroachdb Datadog integration only supports 372 metrics per its metadata.csv file.

This patch updates the cockroachdb Datadog integration to support all of the metrics exposed by CockroachDB as of the latest stable release, which is v23.2 at the time of this writing.

For histogram metrics, the .bucket, .count, and .sum metrics are not included since it's not clear whether those need to be in the metadata.csv file. Those could be added in a future commit if necessary.

Resolves #16694.

Motivation

#16694

Additional Notes

N/A

Review checklist (to be filled by reviewers)

  • [ ] Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • [ ] Changelog entries must be created for modifications to shipped code
  • [ ] Add the qa/skip-qa label if the PR doesn't need to be tested during QA.

sudomateo avatar Jan 24 '24 02:01 sudomateo

Thank you very much for the PR! In order to fix the CI you'll need to run our linter. It can be done with ddev test cockroadb -fs. Then I believe some tests will start to fail, so it would be awesome to update them to include the new metrics.

Thank you for the review @FlorentClarret!

Just for future readers I had to run ddev config set repos.core <LOCATION_OF_CLONED_REPO> to tell ddev where to find my cloned copy of the repository. By default it was using ~/dd/integrations-core.

I ran ddev test cockroachdb -fs against my latest commit and it only output Line too long lint warnings. In my opinion we should ignore these warnings since the METRIC_MAP in metrics.py is a mapping of Prometheus metrics names to Datadog metric names and breaking up the map keys and values across newlines would arguably make the code less readable. What do you think?

> ddev test cockroachdb -fs
────────────────────────────────── CockroachDB ──────────────────────────────────
───────────────────────────────────── lint ──────────────────────────────────────
cmd [1] | black . --config ../pyproject.toml
All done! ✨ 🍰 ✨
20 files left unchanged.
cmd [2] | ruff --config ../pyproject.toml --fix .
datadog_checks/cockroachdb/metrics.py:117:121: E501 Line too long (121 > 120 characters)
datadog_checks/cockroachdb/metrics.py:135:121: E501 Line too long (129 > 120 characters)
datadog_checks/cockroachdb/metrics.py:138:121: E501 Line too long (127 > 120 characters)
...
datadog_checks/cockroachdb/metrics.py:1236:121: E501 Line too long (127 > 120 characters)
datadog_checks/cockroachdb/metrics.py:1558:121: E501 Line too long (123 > 120 characters)
datadog_checks/cockroachdb/metrics.py:1559:121: E501 Line too long (123 > 120 characters)
Found 57 errors.

Regarding the metdata.csv file, yes bucket metrics .bucket, .sum and .count should all be included.

I regenerated the metadata.csv with the .bucket, .count and .sum metrics for histograms.

sudomateo avatar Jan 30 '24 15:01 sudomateo

Resolved merge conflicts. This is ready for another review @FlorentClarret.

sudomateo avatar Feb 14 '24 00:02 sudomateo

Hey @sudomateo, sorry for the delay.

I'll need to add a couple of tests for these metrics, and update the metadata for them too. I won't bother you with that, I'll take care of them. I'm going to open a couple of PRs to add them to make it a bit easier to review for my team. I will leave comments in this PR to let you know.

Let me know if there's a problem with that. Thanks again for your help here.

FlorentClarret avatar Mar 06 '24 13:03 FlorentClarret

@FlorentClarret no worries! Let me know if you need anything from me. Excited to get this merged.

I can also provide the code that's being used to generate the metadata.csv and metrics.py mapping if that's helpful for you.

sudomateo avatar Mar 06 '24 18:03 sudomateo

Test Results

9 tests   1 :white_check_mark:  9s :stopwatch: 1 suites  7 :zzz: 1 files    1 :x:

For more details on these failures, see this check.

Results for commit 00e6d251.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Mar 07 '24 00:03 github-actions[bot]

I provided the code that's generating the metadata.csv and I also updated metrics.py to ignore long lines for the linter.

The tests that are failing are primary due to the generated metadata.csv not having the metric_type and unit_name that Datadog expects. Happy to work with Datadog to fix this.

sudomateo avatar Mar 07 '24 00:03 sudomateo

Hi again @sudomateo 👋

I opened a bunch of PRs to refactor our test suite and split this PR to make it a bit easier to review for the team. I also added the tests, and fixed tiny other issues. ftr, all the PRs:

  • https://github.com/DataDog/integrations-core/pull/17076
  • https://github.com/DataDog/integrations-core/pull/17077
  • https://github.com/DataDog/integrations-core/pull/17078
  • https://github.com/DataDog/integrations-core/pull/17079
  • https://github.com/DataDog/integrations-core/pull/17135
  • https://github.com/DataDog/integrations-core/pull/17141
  • https://github.com/DataDog/integrations-core/pull/17155
  • https://github.com/DataDog/integrations-core/pull/17163
  • https://github.com/DataDog/integrations-core/pull/17164
  • https://github.com/DataDog/integrations-core/pull/17166
  • https://github.com/DataDog/integrations-core/pull/17183
  • https://github.com/DataDog/integrations-core/pull/17192

All of them will be shipped with Agent 7.53.0 by default.

Thanks again for your work on this, we will also credit you in the changelogs.

FlorentClarret avatar Mar 21 '24 09:03 FlorentClarret

Thank you @FlorentClarret! I'll use the updated metadata.csv in our process now.

sudomateo avatar Mar 21 '24 11:03 sudomateo