Add metrics for config reloads and config hash

Open jcreixell opened this issue 3 years ago • 1 comments

PR Description

  • Add metrics for config reloads
    • Adds gauges to help debug issues with config reloads
  • Add metric reporting config hash
    • Adds a gauge reporting a sha256 hash of the config (before expansion) via label. This should be useful to debug config reload issues.
    • The gauge is implemented for agent, agentctl and agent_flow and supports static, dynamic and remote config.
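As a rough illustration of the hash gauge described above, the sketch below hashes the raw (pre-expansion) config bytes with SHA-256 and exposes the hex digest as the label of a gauge set to 1. It is not the code from this PR: `hashGauge` is a hypothetical in-memory stand-in for a labelled Prometheus gauge, used so the example needs only the standard library, and all names are assumptions. Clearing the previous series on every update is one possible way to keep a single series per process, which bears on the cardinality question raised in the reviewer notes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// hashGauge is a hypothetical stand-in for a gauge with one label (the hash).
// It maps label value -> gauge value; a real implementation would use a
// metrics library instead.
type hashGauge struct {
	mu     sync.Mutex
	series map[string]float64
}

func newHashGauge() *hashGauge {
	return &hashGauge{series: make(map[string]float64)}
}

// configHash returns the hex-encoded SHA-256 digest of the raw config bytes,
// taken before any expansion is applied.
func configHash(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

// Set removes any previously reported hash and records the new one with
// value 1, so at most one series exists at a time.
func (g *hashGauge) Set(raw []byte) string {
	h := configHash(raw)
	g.mu.Lock()
	defer g.mu.Unlock()
	for k := range g.series {
		delete(g.series, k)
	}
	g.series[h] = 1
	return h
}

// Len reports how many series are currently exposed.
func (g *hashGauge) Len() int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return len(g.series)
}

func main() {
	g := newHashGauge()
	fmt.Println("initial:", g.Set([]byte("server:\n  log_level: info\n")))
	fmt.Println("reload: ", g.Set([]byte("server:\n  log_level: debug\n")))
	fmt.Println("series: ", g.Len())
}
```

With this shape, each scrape shows only the currently loaded config's hash; stale hashes would still live on in the time-series database until they age out.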

Which issue(s) this PR fixes

Fixes #2040

Notes to the Reviewer

  • I think I instrumented the code in the right places but might have missed something, please double check
  • I had to namespace the metric name for flow, as I would get a double registration otherwise. Any ideas on how to do this better without too much complexity?
  • Is there a risk of cardinality explosion for the hash metric? (for example with dynamic configs)
  • I assume we don't test internal metrics
  • I assume we don't document these metrics, please correct if I am wrong
  • Anything else I missed?

PR Checklist

  • [x] CHANGELOG updated

jcreixell avatar Sep 15 '22 16:09 jcreixell

@mattdurham I have addressed your comments, please have another look when you have time. The only remaining issue I can think of is that both flow and non-flow metrics are exposed. However, I am not sure that fixing this is worth the hassle. Thank you!

jcreixell avatar Sep 19 '22 20:09 jcreixell

@mattdurham See my last commit for a refactoring moving the duplicated code into a new package.

I tried the simple solution of manually registering the metrics during execution, but it was quite intrusive, especially in config.go (since Load can be called multiple times, I would need to register the metrics outside that package and pass them as a parameter). In the end, I opted for using a singleton instance for the metrics, reused both in flow and regular agent.
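The singleton approach mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the package from the commit: `configMetrics`, `Instance`, `RecordReload`, and the `registrations` counter are hypothetical names, and plain counters stand in for real metrics so the example uses only the standard library. The point it illustrates is that `sync.Once` makes the "register metrics" step run exactly once, no matter how many times a `Load`-style function calls into it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// configMetrics is a hypothetical holder for config reload metrics.
type configMetrics struct {
	mu            sync.Mutex
	reloadsOK     int
	reloadsFailed int
}

var (
	instance      *configMetrics
	once          sync.Once
	registrations int // how many times the one-time setup actually ran
)

// Instance returns the shared metrics object, creating it on first use.
// The setup inside once.Do is where registration with a metrics registry
// would happen; it cannot run twice, so double registration is avoided.
func Instance() *configMetrics {
	once.Do(func() {
		registrations++
		instance = &configMetrics{}
	})
	return instance
}

// RecordReload counts a reload as successful or failed depending on err.
func (m *configMetrics) RecordReload(err error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if err != nil {
		m.reloadsFailed++
		return
	}
	m.reloadsOK++
}

// load is a hypothetical stand-in for a config Load function that may be
// called many times during the life of the process.
func load(fail bool) {
	var err error
	if fail {
		err = errors.New("invalid config")
	}
	Instance().RecordReload(err)
}

func main() {
	load(false)
	load(true)
	load(false)
	m := Instance()
	fmt.Println("ok:", m.reloadsOK, "failed:", m.reloadsFailed, "registrations:", registrations)
}
```

The trade-off is the usual one for package-level singletons: callers do not need to thread a metrics object through every function, at the cost of hidden global state that is harder to reset in tests.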

Location of this new package and naming are up for discussion.

jcreixell avatar Sep 26 '22 20:09 jcreixell

@grafana/grafana-agent-maintainers I think I am done here. Feel free to merge if you are OK with my last commits and comments.

jcreixell avatar Oct 05 '22 17:10 jcreixell