telegraf
telegraf wrongly assumes metric is prometheus histogram and breaks itself
Relevant telegraf.conf
[global_tags]
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
hostname = ""
omit_hostname = false
[[inputs.influxdb_listener]]
service_address = "127.0.0.1:8086"
[[outputs.opentelemetry]]
Logs from Telegraf
Nov 30 15:07:38 telegraf[13175]: 2023-11-30T20:07:38Z W! [outputs.opentelemetry] Failed to add point: unsupported histogram count value type int64
[the same warning is logged 33 more times within the same second]
System info
telegraf 1.28.5
Docker
No response
Steps to reproduce
- Send some InfluxDB metrics that include a count field ending in i, for example count=1i
- Attempt to export via outputs.opentelemetry
- Observe telegraf complain about the incorrect assumption it made with the warning message: "[outputs.opentelemetry] Failed to add point: unsupported histogram count value type int64"
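The first step could be scripted roughly as follows. This is a minimal Go sketch, assuming the influxdb_listener from the config above is reachable at 127.0.0.1:8086 and accepts InfluxDB v1-style writes on /write. The measurement name "example" and the helper buildLine are made up for illustration and are not part of the original report.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// buildLine returns one InfluxDB line-protocol point whose "count" field is
// an integer. The trailing "i" is what marks the value as int64.
// (Hypothetical helper, named for this sketch only.)
func buildLine(measurement string, count int64) string {
	return fmt.Sprintf("%s count=%di", measurement, count)
}

func main() {
	body := buildLine("example", 1)

	// Assumption: the listener address matches service_address in the config.
	resp, err := http.Post("http://127.0.0.1:8086/write", "text/plain", strings.NewReader(body))
	if err != nil {
		fmt.Println("post failed:", err)
		return
	}
	defer resp.Body.Close()

	fmt.Println("listener responded:", resp.Status)
}
```

With Telegraf running the config above, each point sent this way should end up at outputs.opentelemetry on the next flush.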
Expected behavior
The metric should be converted to a format that is compatible with opentelemetry's specification and sent to the specified output.
Actual behavior
telegraf freaks out about a type assumption that it made on its own and refuses to output via opentelemetry
Additional info
I think the main problem is that the logic for assuming a prometheus histogram is faulty. This also came up in: https://github.com/influxdata/telegraf/pull/12431
I would suggest one of the following:
- The logic shouldn't be assuming it's a prometheus histogram when the count field is of type int64.
- The value should be converted to a supported type automatically.
- It should be possible in the configuration file to explicitly define the conversion for values of incompatible types.
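To make the second suggestion concrete, here is a minimal Go sketch of what an automatic conversion might look like. It assumes that the converter wants the count as a float64. The helper toFloat64 is hypothetical and is not taken from Telegraf or from the otel conversion library.

```go
package main

import "fmt"

// toFloat64 is a hypothetical helper that coerces the numeric field types a
// metric can carry into float64, and rejects everything else.
func toFloat64(v interface{}) (float64, error) {
	switch n := v.(type) {
	case float64:
		return n, nil
	case int64:
		return float64(n), nil
	case uint64:
		return float64(n), nil
	default:
		return 0, fmt.Errorf("unsupported count value type %T", v)
	}
}

func main() {
	// An integer count such as count=1i arrives as int64.
	f, err := toFloat64(int64(1))
	fmt.Println(f, err)

	// Non-numeric values are still rejected.
	_, err = toFloat64("not a number")
	fmt.Println(err)
}
```

Note that converting very large int64 or uint64 values to float64 can lose precision, which may be one reason a library would refuse to convert silently.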
Hi,
Thanks for the report.
telegraf freaks out about a type assumption that it made on its own and refuses to output via opentelemetry
The input, influxdb listener, does not make any assumptions nor does it set any types on metrics. InfluxDB does not have the concept of types, so everything gets recorded as "untyped" in Telegraf, not "histogram".
That error message is coming from a call to "AddPoint", which is from the influx2otel library. I believe the error is returned here, in convertHistogramV1. That function is only called if the determined type is histogram, which would require that a) the type was not untyped, which is clearly wrong, and b) both the count and sum fields exist, which I am guessing is also not the case.
I think what is actually happening here is that our iotas are not lined up between Telegraf and the otel library. It looks like Telegraf's Untyped, 2, is the otel library's Histogram.
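For anyone following along, this is the general shape of that kind of bug, as a minimal Go sketch. The two enums below are invented for illustration. Their names, ordering and numeric values are assumptions and are not copied from Telegraf or from the otel library.

```go
package main

import "fmt"

// producerType stands in for the metric type enum on the sending side.
type producerType int

const (
	producerCounter   producerType = iota // 0
	producerGauge                         // 1
	producerUntyped                       // 2
	producerHistogram                     // 3
)

// consumerType stands in for the enum on the receiving side. It was declared
// independently, so the same numbers carry different meanings.
type consumerType int

const (
	consumerUntyped   consumerType = iota // 0
	consumerGauge                         // 1
	consumerHistogram                     // 2
	consumerSum                           // 3
)

// naiveConvert casts the raw integer and silently changes the meaning.
func naiveConvert(t producerType) consumerType {
	return consumerType(t)
}

// explicitConvert maps by meaning, so it does not depend on iota ordering.
func explicitConvert(t producerType) consumerType {
	switch t {
	case producerCounter:
		return consumerSum
	case producerGauge:
		return consumerGauge
	case producerHistogram:
		return consumerHistogram
	default:
		return consumerUntyped
	}
}

func main() {
	// With the cast, an untyped metric is treated as a histogram.
	fmt.Println(naiveConvert(producerUntyped) == consumerHistogram)
	// With the explicit mapping, it stays untyped.
	fmt.Println(explicitConvert(producerUntyped) == consumerUntyped)
}
```

If the two libraries exchange the type as a plain integer, an explicit mapping like explicitConvert is one way to keep them from drifting apart when either enum is reordered.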
@jacobmarble is this something you could please chime in on from the otel library perspective?
Thanks!