Increased use of CPU time in version 1.25

Open VladislavGatsenko opened this issue 2 years ago • 4 comments

Relevant telegraf.conf

# Configuration for telegraf agent
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 500
  metric_buffer_limit = 5000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

[[outputs.influxdb_v2]]
  urls = ["http://127.0.0.1:8086"]
  token = "IGb2BV7AgMNeRGpH0Vwyjil3yFEXNS7Y5vJbMacoGRDcP1J8B32YJXLQb16l_IKMPZJQ_vlK48olJxVRhirl1A=="
  organization = "IoT"
  bucket = "IoT"

[[inputs.mqtt_consumer]]
  servers = ["tcp://127.0.0.1:1883"]
  topics = ["#"]
  username = "IoT"
  password = "student"
  data_format = "value"
  data_type = "float"

Logs from Telegraf

root@debian:~# telegraf --debug
2022-12-20T07:23:45Z I! Using config file: /etc/telegraf/telegraf.conf
2022-12-20T07:23:45Z I! Starting Telegraf 1.25.0
2022-12-20T07:23:45Z I! Available plugins: 228 inputs, 9 aggregators, 26 processors, 21 parsers, 57 outputs, 2 secret-stores
2022-12-20T07:23:45Z I! Loaded inputs: mqtt_consumer (1x)
2022-12-20T07:23:45Z I! Loaded aggregators: 
2022-12-20T07:23:45Z I! Loaded processors: 
2022-12-20T07:23:45Z I! Loaded secretstores: 
2022-12-20T07:23:45Z I! Loaded outputs: influxdb_v2
2022-12-20T07:23:45Z I! Tags enabled: host=debian
2022-12-20T07:23:45Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"debian", Flush Interval:10s
2022-12-20T07:23:45Z D! [agent] Initializing plugins
2022-12-20T07:23:45Z D! [agent] Connecting outputs
2022-12-20T07:23:45Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-12-20T07:23:45Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-12-20T07:23:45Z D! [agent] Starting service inputs
2022-12-20T07:23:45Z I! [inputs.mqtt_consumer] Connected [tcp://127.0.0.1:1883]
2022-12-20T07:23:56Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:06Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:16Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:26Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics

System info

Telegraf 1.25, Debian 11.5

Docker

Steps to reproduce

In Additional info

Expected behavior

In Additional info

Actual behavior

In Additional info

Additional info

Telegraf 1.25 uses more than twice as much CPU time.

For example, Telegraf 1.24.4 consumed about 1 hour of CPU time per day. The latest version, 1.25, consumes about 2.5 hours of CPU time per day with the same settings!
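To put the reported figures in perspective, here is a small sketch that converts "CPU hours per day" into average utilization of a single core. The function name is made up for illustration, and the inputs are only the approximate values quoted above, not new measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_core_utilization(cpu_hours_per_day: float) -> float:
    """Return the average fraction of one CPU core used over a day."""
    return cpu_hours_per_day * 3600 / SECONDS_PER_DAY


old = average_core_utilization(1.0)  # Telegraf 1.24.4, as reported
new = average_core_utilization(2.5)  # Telegraf 1.25, as reported

print(f"1.24.4: {old:.1%} of one core")
print(f"1.25:   {new:.1%} of one core")
print(f"ratio:  {new / old:.1f}x")
```

Under these assumptions the absolute load is small in both cases, but the relative increase is about 2.5x.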

The system is Debian 11.5 with the latest updates. The configuration uses ONE connection to a local mqtt broker.

Something went wrong...

VladislavGatsenko, Dec 20 '22 07:12

> twice as much CPU time.

How are you measuring this?

powersj, Jan 03 '23 21:01

I measure with the htop utility, in the Time+ column

VladislavGatsenko, Jan 03 '23 21:01

> I measure with the htop utility, in the Time+ column

OK, that field shows how much CPU time the process has used.
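For a measurement that is easier to log and compare than reading htop by eye, the same total (user plus system CPU time) can be read from /proc on Linux. This is a minimal sketch: the function names are hypothetical, and the sample line is made up purely to show the parsing.

```python
import os


def cpu_seconds_from_stat(stat_line: str, clk_tck: int) -> float:
    """Return user + system CPU seconds from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime_ticks = int(rest[11])
    stime_ticks = int(rest[12])
    return (utime_ticks + stime_ticks) / clk_tck


def process_cpu_seconds(pid: int) -> float:
    """Read the accumulated CPU time of a running process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return cpu_seconds_from_stat(f.read(), os.sysconf("SC_CLK_TCK"))


# Made-up sample line: utime = 36000 ticks, stime = 24000 ticks.
sample = (
    "1234 (telegraf) S 1 1234 1234 0 -1 4194560 500 0 0 0 "
    "36000 24000 0 0 20 0 12 0 100 0 0"
)
print(cpu_seconds_from_stat(sample, clk_tck=100))  # 600.0
```

Calling process_cpu_seconds() with the telegraf PID at the start and end of a fixed interval, once per version, gives two numbers that can be compared directly.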

This could depend entirely on how many messages you are passing through MQTT and need to process, how many connection retry attempts your input requires, whether any retries or issues with the influxdb output caused additional work, etc.

The changes between v1.24.4 and v1.25.0 include:

  • In the mqtt_consumer plugin, https://github.com/influxdata/telegraf/pull/10696 reworked some of the message tracking and connecting.
  • The version of Go was updated from v1.19.1 to v1.19.4.
  • I don't believe there were any changes to the influxdb output.

There is nothing of concern and there are no red flags in any of the above. If you want to spend the time to profile the two binaries and compare the results, check out the profiling readme. Based on this report as-is, though, there are no further actions.
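For anyone who wants to try the profiling route, here is a rough sketch of capturing a CPU profile from each version. It assumes Telegraf was started with its pprof listener enabled (for example `telegraf --pprof-addr localhost:6060`, as described in the profiling readme) and that the listener serves the standard Go /debug/pprof/profile endpoint. The helper names and output file names are made up for illustration.

```python
import urllib.request


def cpu_profile_url(pprof_addr: str, seconds: int = 30) -> str:
    """Build the URL of the Go pprof CPU profile endpoint."""
    return f"http://{pprof_addr}/debug/pprof/profile?seconds={seconds}"


def save_cpu_profile(pprof_addr: str, out_path: str, seconds: int = 30) -> None:
    """Download a CPU profile sampled over `seconds` and write it to a file."""
    url = cpu_profile_url(pprof_addr, seconds)
    # The request blocks for roughly `seconds` while the profile is sampled,
    # so allow some extra time before giving up.
    with urllib.request.urlopen(url, timeout=seconds + 30) as resp:
        data = resp.read()
    with open(out_path, "wb") as out:
        out.write(data)


# Example usage (assumed address and file name), run once per version:
# save_cpu_profile("localhost:6060", "telegraf-1.25.0-cpu.pprof")
```

The two saved files can then be opened with `go tool pprof` to see where each version spends its CPU time.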

powersj, Jan 04 '23 17:01

The MQTT configs are exactly the same for each version when I do the measurement. I don't send any messages to the MQTT topics during the CPU time measurement. Telegraf just runs with the config specified at the beginning.

I know what CPU time means, and it seemed very strange to me that, with the same use case, the newer version shows more than twice as much activity in the telegraf process.

If that's the way it's supposed to be, okay. Unfortunately, I don't have the knowledge to dig any deeper.

VladislavGatsenko, Jan 04 '23 21:01

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem. If not, please try posting this question in our Community Slack or Community Page. Thank you!

telegraf-tiger[bot], Jan 19 '23 18:01