plugin entry collection_jitter doesn't take precedence over [agent] collection_jitter

Open iridos opened this issue 1 year ago • 5 comments

Relevant telegraf.conf

  1. non-working example
[global_tags]
[agent]
  interval = "60s"
  round_interval = true
  collection_jitter = "10s"
  omit_hostname = true
[[ inputs.exec]]
   interval = "10s"
   collection_jitter = "0s"
   collection_offset = "0s"
   commands = [
      "/etc/telegraf/scripts/write-date"
   ]
   timeout = "5s"
   data_format = "influx"
[[outputs.file]]
  files = ["stdout", "/tmp/metrics.out"]
  2. working example: same config as in 1), but with collection_jitter = "10s" removed from [agent]
  3. date-printing script (just to have the time directly in the output)
#!/bin/bash
echo "test-output,date=$(date -Is|sed 's/[+].*//') value=1"

Logs from Telegraf

  1. run with config of 1):
# /usr/bin/telegraf  -config /etc/telegraf/telegraf-test.conf 
2024-04-25T07:42:29Z I! Loading config: /etc/telegraf/telegraf-test.conf
2024-04-25T07:42:29Z I! Starting Telegraf 1.30.1 brought to you by InfluxData the makers of InfluxDB
2024-04-25T07:42:29Z I! Available plugins: 233 inputs, 9 aggregators, 31 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-04-25T07:42:29Z I! Loaded inputs: exec
2024-04-25T07:42:29Z I! Loaded aggregators: 
2024-04-25T07:42:29Z I! Loaded processors: 
2024-04-25T07:42:29Z I! Loaded secretstores: 
2024-04-25T07:42:29Z I! Loaded outputs: file
2024-04-25T07:42:29Z I! Tags enabled: 
2024-04-25T07:42:29Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"", Flush Interval:10s
test-output,date=2024-04-25T09:42:35 value=1 1714030956000000000
test-output,date=2024-04-25T09:42:42 value=1 1714030963000000000
test-output,date=2024-04-25T09:42:54 value=1 1714030974000000000
test-output,date=2024-04-25T09:43:09 value=1 1714030989000000000
test-output,date=2024-04-25T09:43:11 value=1 1714030991000000000
test-output,date=2024-04-25T09:43:23 value=1 1714031003000000000
^C2024-04-25T07:43:31Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-04-25T07:43:31Z I! [agent] Stopping running outputs
  2. run with the agent's collection_jitter removed:
# /usr/bin/telegraf  -config /etc/telegraf/telegraf-test.conf 
2024-04-25T07:47:32Z I! Loading config: /etc/telegraf/telegraf-test.conf
2024-04-25T07:47:32Z I! Starting Telegraf 1.30.1 brought to you by InfluxData the makers of InfluxDB
2024-04-25T07:47:32Z I! Available plugins: 233 inputs, 9 aggregators, 31 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-04-25T07:47:32Z I! Loaded inputs: exec
2024-04-25T07:47:32Z I! Loaded aggregators: 
2024-04-25T07:47:32Z I! Loaded processors: 
2024-04-25T07:47:32Z I! Loaded secretstores: 
2024-04-25T07:47:32Z I! Loaded outputs: file
2024-04-25T07:47:32Z I! Tags enabled: 
2024-04-25T07:47:32Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"", Flush Interval:10s
test-output,date=2024-04-25T09:47:40 value=1 1714031260000000000
test-output,date=2024-04-25T09:47:50 value=1 1714031270000000000
test-output,date=2024-04-25T09:48:00 value=1 1714031280000000000
test-output,date=2024-04-25T09:48:10 value=1 1714031290000000000
test-output,date=2024-04-25T09:48:20 value=1 1714031300000000000

System info

Telegraf 1.30.1

Docker

No response

Steps to reproduce

  1. use config 1) from above, which sets collection_jitter both in the [agent] configuration and in a plugin configuration (here the exec plugin)

...

Expected behavior

https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md#intervals

collection_jitter: Overrides the collection_jitter setting of the [agent] (https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md#agent) for the plugin. Collection jitter is used to jitter the collection by a random [interval]

Actual behavior

When [agent] configures a jitter of 10s and the plugin configures a jitter of 0s, we still see jitter in the test output (see the output in the "Logs from Telegraf" section).

Additional info

No response

iridos avatar Apr 25 '24 08:04 iridos

@iridos actually, the plugin's option has precedence if and only if it is not zero. :-( So setting the jitter to e.g. one nanosecond will override the agent's setting. This, or defining the jitter per plugin while keeping the agent's setting at zero, is the only workaround I can offer.
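
For illustration, the first workaround might look like the sketch below, applied to the config from the report. It assumes that the string "1ns" is accepted as a duration value for collection_jitter; the thread only says "one nanosecond" without giving the exact syntax, and the result has not been verified against a running Telegraf.

```toml
# Sketch of the one-nanosecond workaround described above (unverified).
[agent]
  interval = "60s"
  round_interval = true
  collection_jitter = "10s"   # agent-wide default, still used by other inputs
  omit_hostname = true

[[inputs.exec]]
  interval = "10s"
  # "0s" counts as "not set", so the agent's 10s jitter would apply.
  # A tiny non-zero value takes precedence instead
  # (assumption: "1ns" is a valid duration string here).
  collection_jitter = "1ns"
  commands = ["/etc/telegraf/scripts/write-date"]
  timeout = "5s"
  data_format = "influx"

[[outputs.file]]
  files = ["stdout"]
```

The other workaround is the reverse arrangement: leave collection_jitter out of [agent] entirely and set a non-zero value only on the plugins that should be jittered.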

Solving this issue would involve a massive restructuring of the configuration handling: we currently instantiate plugins as we load the configs, and we cannot guarantee that the agent section is loaded first, so we cannot complete the plugin's configuration (including agent-level settings) on instantiation...

srebhan avatar Apr 25 '24 08:04 srebhan

Oh, ok. This might be something to put into the documentation, though.

iridos avatar Apr 25 '24 09:04 iridos

I do agree. Would you be willing to submit a PR?

srebhan avatar Apr 25 '24 11:04 srebhan

mh, I can try. But I am not too familiar with git, nor with the documentation structure. I have taken CONFIGURATION.md to be the final authoritative source of documentation, not sure what else there is.

iridos avatar Apr 26 '24 14:04 iridos

I have taken CONFIGURATION.md to be the final authoritative source of documentation, not sure what else there is.

That would be the primary place to update. Thanks!

powersj avatar Apr 26 '24 14:04 powersj