Increased use of CPU time in version 1.25
Relevant telegraf.conf
# Configuration for telegraf agent
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 500
metric_buffer_limit = 5000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[outputs.influxdb_v2]]
urls = ["http://127.0.0.1:8086"]
token = "IGb2BV7AgMNeRGpH0Vwyjil3yFEXNS7Y5vJbMacoGRDcP1J8B32YJXLQb16l_IKMPZJQ_vlK48olJxVRhirl1A=="
organization = "IoT"
bucket = "IoT"
[[inputs.mqtt_consumer]]
servers = ["tcp://127.0.0.1:1883"]
topics = ["#"]
username = "IoT"
password = "student"
data_format = "value"
data_type = "float"
Logs from Telegraf
root@debian:~# telegraf --debug
2022-12-20T07:23:45Z I! Using config file: /etc/telegraf/telegraf.conf
2022-12-20T07:23:45Z I! Starting Telegraf 1.25.0
2022-12-20T07:23:45Z I! Available plugins: 228 inputs, 9 aggregators, 26 processors, 21 parsers, 57 outputs, 2 secret-stores
2022-12-20T07:23:45Z I! Loaded inputs: mqtt_consumer (1x)
2022-12-20T07:23:45Z I! Loaded aggregators:
2022-12-20T07:23:45Z I! Loaded processors:
2022-12-20T07:23:45Z I! Loaded secretstores:
2022-12-20T07:23:45Z I! Loaded outputs: influxdb_v2
2022-12-20T07:23:45Z I! Tags enabled: host=debian
2022-12-20T07:23:45Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"debian", Flush Interval:10s
2022-12-20T07:23:45Z D! [agent] Initializing plugins
2022-12-20T07:23:45Z D! [agent] Connecting outputs
2022-12-20T07:23:45Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2022-12-20T07:23:45Z D! [agent] Successfully connected to outputs.influxdb_v2
2022-12-20T07:23:45Z D! [agent] Starting service inputs
2022-12-20T07:23:45Z I! [inputs.mqtt_consumer] Connected [tcp://127.0.0.1:1883]
2022-12-20T07:23:56Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:06Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:16Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
2022-12-20T07:24:26Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 5000 metrics
System info
Telegraf 1.25, Debian 11.5
Docker
Steps to reproduce
See Additional info below.
Expected behavior
See Additional info below.
Actual behavior
See Additional info below.
Additional info
Telegraf 1.25 uses more than twice as much CPU time.
For example, Telegraf 1.24.4 consumed about 1 hour of CPU time per day. The latest 1.25 version consumes about 2.5 hours of CPU time per day with the same settings!
The system is Debian 11.5 with the latest updates. The configuration uses ONE connection to a local mqtt broker.
Something went wrong...
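A minimal sketch of the comparison described above, assuming a local Mosquitto broker and the configuration shown at the top of this issue (the package name, service name, and config path are assumptions, not taken from the original report):
# start a local MQTT broker (Mosquitto is assumed here)
systemctl start mosquitto
# run the Telegraf version under test with the configuration above
telegraf --config /etc/telegraf/telegraf.conf
# leave it idle (no MQTT publishes), then compare the accumulated CPU time
# of v1.24.4 and v1.25.0 after the same wall-clock interval, e.g. 24 hours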
> twice as much CPU time.
How are you measuring this?
I measure it with the htop utility, in the TIME+ column.
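For what it's worth, htop's TIME+ value can be cross-checked from the command line. This is only a sketch; it assumes a single telegraf process and the standard procps and sysstat packages on Debian:
# cumulative CPU time, elapsed time, and average CPU share of the telegraf process
ps -o pid,etime,time,%cpu -p "$(pidof telegraf)"
# or sample per-second CPU usage for one minute (pidstat is part of sysstat)
pidstat -p "$(pidof telegraf)" 1 60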
OK, that field specifies how much CPU time the process has used.
This could depend entirely on how many messages you are passing and needing to process via MQTT, how many connection retry attempts are required by your input, whether any retries or issues came up requiring additional work in the influxdb output, etc.
The changes between v1.24.4 and v1.25.0 include:
- In the mqtt_consumer plugin, https://github.com/influxdata/telegraf/pull/10696 reworked some of the message tracking and connection handling.
- The Go version was updated from v1.19.1 to v1.19.4.
- I don't believe there were any changes to the influxdb output.
There is nothing of concern and no red flags in any of the above. If you want to spend the time to profile the binaries and compare the two, check out the profiling readme, but based on this report as-is there are no further actions to take.
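As a rough sketch of that profiling workflow (it assumes the Go toolchain is available and relies on Telegraf's documented --pprof-addr option; the address and sample duration here are arbitrary choices):
# run each Telegraf version with the pprof HTTP endpoint enabled
telegraf --config /etc/telegraf/telegraf.conf --pprof-addr localhost:6060
# in another shell, capture a 60-second CPU profile and list the hottest functions
go tool pprof -top 'http://localhost:6060/debug/pprof/profile?seconds=60'
# repeat for v1.24.4 and v1.25.0 and compare the top entries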
The MQTT configs are exactly the same for each version when I do the measurement. I don't send any messages to the MQTT topics during the CPU time measurement; Telegraf just runs with the config specified at the beginning.
I know what CPU time means, and it was very strange to me that the use case is the same yet the two versions show twice as much activity in the telegraf process.
If that's the way it's supposed to be, okay. Unfortunately, I don't have the knowledge to dig any deeper.
Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Page. Thank you!