
Fix issue with config reload when using a log pipeline with a metric stage

Open ptodev opened this issue 1 year ago • 0 comments

PR Description

There is a longstanding bug which prevents the agent config from being reloaded when there is a metrics stage in a logging pipeline.

This PR also fixes another bug during config reloads. When a reload is triggered, static mode mistakenly thinks that the new config doesn't match the old one (and hence triggers a reload) for two reasons:

  • The X-Agent-Id header isn't in the user-defined config.
  • In the scrape_configs/static_configs section, there might be a targets: [localhost] entry which Promtail added by default because the user didn't set it explicitly.

For example, when using static_mode.yaml.txt, the Agent interprets the new config as example_incoming.yaml.txt. This doesn't match what it believes the "current" running config to be (example_running.yaml.txt), so it does a config reload. This issue is not present in Flow mode; it only affects static mode.

There might also be similar issues that cause config mismatches. It's possible that I haven't seen all of them, since my config file uses very few features. The only sure way of eliminating these issues that I can think of is to store the incoming config as a string, as this PR does.
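As a rough sketch of the "store the incoming config as a string" idea, assuming a hypothetical configReloader type (the names and structure are illustrative and not taken from this PR's diff), the reload decision can be based on the raw text rather than on the parsed config:

```go
package main

import "fmt"

// configReloader is a hypothetical type used only for illustration.
// It remembers the raw text of the last applied config, not the parsed
// structs, so injected defaults cannot cause a mismatch.
type configReloader struct {
	lastRaw string // raw text of the most recently applied config
	applied bool   // whether any config has been applied yet
	reloads int    // number of reloads actually performed
}

// Apply reports whether a reload was performed. It skips the reload
// when the incoming raw config is byte-for-byte identical to the last one.
func (r *configReloader) Apply(raw string) bool {
	if r.applied && raw == r.lastRaw {
		return false
	}
	r.lastRaw = raw
	r.applied = true
	r.reloads++
	return true
}

func main() {
	r := &configReloader{}
	raw := "scrape_configs:\n  - job_name: varlogs\n"

	fmt.Println(r.Apply(raw))                // true: first load
	fmt.Println(r.Apply(raw))                // false: unchanged text, no reload
	fmt.Println(r.Apply(raw + "# edited\n")) // true: text changed
	fmt.Println(r.reloads)                   // 2
}
```

The trade-off of comparing raw text is that a whitespace-only or comment-only edit still counts as a change, but an unchanged file never triggers a reload, regardless of which defaults are added during parsing.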

Which issue(s) this PR fixes

Fixes #2754

Notes to the Reviewer

A few extra things need to be done:

  • We need unit tests for static mode which do a config reload, similar to the unit test which this PR creates for Flow mode.
  • The "config not matching" bug has only been fixed for logs pipelines. It needs to be fixed for all pipelines (e.g. metrics, traces, etc). Otherwise there's a chance that the metrics from the logs stage will be reset. If the users are running a config reloader sidecar which reloads the config every X seconds, this can render the metrics stage unusable.

This bug was fixed in Promtail, and the Promtail fix was ported to Agent. However, due to the additional complexity of the Agent, simply porting the Promtail code wasn't sufficient.

This bugfix should also be done in Alloy.

PR Checklist

  • [ ] CHANGELOG.md updated
  • [x] Documentation added
  • [ ] Tests updated
  • [ ] Config converters updated

ptodev • Jun 27 '24 15:06