Missing some information when using OpenTelemetry Output Plugin

Open haihh05 opened this issue 1 year ago • 12 comments

Relevant telegraf.conf

#telegraf1.conf
[[inputs.jti_openconfig_telemetry]]
  servers = ["my_network_device:4317"]

  sample_frequency = "10000ms"

  sensors = [
    "2000ms intefaces /interfaces",
  ]

[[outputs.opentelemetry]]
  service_address = "0.0.0.0:30003"

[[outputs.file]]
  files = ["stdout"]

  data_format = "influx"

#telegraf2.conf
[[inputs.opentelemetry]]
  service_address = "0.0.0.0:30003"

[[outputs.file]]
  files = ["stdout"]

  data_format = "influx"

Logs from Telegraf

telegraf1  | intefaces,/interfaces/interface/@name=ae0,device=...,host=...,path=sensor_1004_5_1:/interfaces/:/interfaces/:mib2d,system_id=... _component_id=65535i,/interfaces/interface/state/type="other",/interfaces/interface/state/description="...",/interfaces/interface/state/oper-status="UP",/interfaces/interface/ethernet/state/enable-flow-control=false,/interfaces/interface/state/name="ae0",_timestamp=1687351346281i,/interfaces/interface/state/admin-status="UP",/interfaces/interface/ethernet/state/mac-address="...",/interfaces/interface/ethernet/state/negotiated-port-speed="SPEED_10GB",_subcomponent_id=0i,/interfaces/interface/state/enabled=true,/interfaces/interface/state/last-change=1687025787040761000i,/interfaces/interface/hold-time/state/up=0i,/interfaces/interface/hold-time/state/down=0i,/interfaces/interface/ethernet/state/port-speed="SPEED_10GB",/interfaces/interface/ethernet/state/hw-mac-address="...",_sequence=292i,/interfaces/interface/state/ifindex=517i,/interfaces/interface/state/logical=false,/interfaces/interface/state/mtu=9192i,/interfaces/interface/state/loopback-mode=false 1687322708946903506

telegraf2  | intefaces_/interfaces/interface/state/ifindex,/interfaces/interface/@name=ae0,device=...,host=...,path=sensor_1004_5_1:/interfaces/:/interfaces/:mib2d,system_id=... gauge=517i 168732270894690350

System info

Telegraf 1.26.3

Docker

#docker-compose.yaml
version: '3.3'

services:
  telegraf1:
    image: telegraf:1.26.3
    container_name: telegraf1
    network_mode: host
    restart: unless-stopped
    volumes:
      - ./telegraf1.conf:/etc/telegraf/telegraf.conf

  telegraf2:
    image: telegraf:1.26.3
    container_name: telegraf2
    network_mode: host
    restart: unless-stopped
    volumes:
      - ./telegraf2.conf:/etc/telegraf/telegraf.conf

Steps to reproduce

  1. Run the docker-compose file with telegraf1.conf and telegraf2.conf: docker compose up -d
  2. Check logs and compare (results as logs above)

Expected behavior

The telegraf2 logs are the same as the telegraf1 logs

Actual behavior

The telegraf2 logs are not the same as the telegraf1 logs

Additional info

No response

haihh05 avatar Jun 21 '23 04:06 haihh05

Hi,

There is not enough information in this issue.

Your title implies that the OpenTelemetry output is somehow missing information, except the only output you provided is from the logs, i.e. the file output.

Your input is the jti_openconfig_telemetry plugin, whose documentation clearly states the following:

Client ID must be unique when connecting from multiple instances
of telegraf to the same device

You are using the exact same config, and hence the same client ID, in both instances. As a result, you are not collecting unique data in each instance.

Please fix your config to have unique client IDs. Closing as not planned.

powersj avatar Jun 21 '23 12:06 powersj

Hi @powersj

To make it clear, my diagram is as below:

diagram

There is one input flow to telegraf1, so it can not have the same client ID.

Why are log1 and log2 not the same?

Even if I add the following to my config, the information is still missing:

  username = "my_user"
  password = "my_pass"
  client_id = "telegraf"

haihh05 avatar Jun 22 '23 04:06 haihh05

There is one input flow to telegraf1, so it can not have the same client ID.

That was not clear from your initial report. Thanks - your edits have made it clearer. Let me ask about this one internally.

powersj avatar Jun 22 '23 13:06 powersj

@haihh05 it would help me reproduce the error if the example inputs/outputs were smaller.

From the provided line protocol, I can see that the measurement name isn't even the same: intefaces vs intefaces_/interfaces/interface/state/ifindex

This is suspicious.

jacobmarble avatar Jun 22 '23 16:06 jacobmarble

@haihh05 your reproduce steps do not yield any logs because nothing sends input to this plugin: [[inputs.jti_openconfig_telemetry]]

jacobmarble avatar Jun 22 '23 16:06 jacobmarble

Also, I would like to emphasize that, while I'm interested to learn and share the cause of this behavior, complete round-trip fidelity is not a goal of the related modules.

jacobmarble avatar Jun 22 '23 16:06 jacobmarble

So, what can I do next? I can provide more details on what you need.

haihh05 avatar Jun 23 '23 03:06 haihh05

From the provided line protocol, I can see that the measurement name isn't even the same: intefaces vs intefaces_/interfaces/interface/state/ifindex

This is suspicious.

You can find this key/value in log1 as "/interfaces/interface/state/ifindex=517i"; in log2 it appears as the measurement "intefaces_/interfaces/interface/state/ifindex" with the field "gauge=517i".

I can confirm that the two log lines come from the same metric: the telegraf2 logs contain no other lines for interface ae0 that carry the additional details present in the telegraf1 logs.

haihh05 avatar Jun 23 '23 03:06 haihh05

So, what can I do next? I can provide more details on what you need.

@haihh05 your reproduce steps do not yield any logs because nothing sends input to this plugin: [[inputs.jti_openconfig_telemetry]]

I need input to this plugin. Without it, I cannot generate log1 and log2 for myself.

jacobmarble avatar Jun 23 '23 22:06 jacobmarble

I need input to this plugin. Without it, I cannot generate log1 and log2 for myself.

The input is from my network device:

Model: qfx5120-48y-8c
Junos: 20.4R3.8

My config commands are:

set system services extension-service request-response grpc clear-text port 4317
set system services extension-service request-response grpc skip-authentication
set system services extension-service notification allow-clients address <telegraf-server>/32

You can see details of metrics at this

haihh05 avatar Jun 24 '23 14:06 haihh05

Model: qfx5120-48y-8c

I do not necessarily expect @jacobmarble to go out and find a $15k switch ;)

I wanted to try this out myself and converted your line protocol above into JSON so I could read it from a file. Here is the JSON version of your metric above:

{
    "name": "intefaces",
    "timestamp": 1687322708946903600,
    "tags": {
      "/interfaces/interface/@name": "ae0",
      "device": "...",
      "host": "...",
      "path": "sensor_1004_5_1:/interfaces/:/interfaces/:mib2d",
      "system_id": "..."
    },
    "fields": {
      "_component_id": 65535,
      "/interfaces/interface/state/type": "other",
      "/interfaces/interface/state/description": "...",
      "/interfaces/interface/state/oper-status": "UP",
      "/interfaces/interface/ethernet/state/enable-flow-control": "false",
      "/interfaces/interface/state/name": "ae0",
      "_timestamp": 1687351346281,
      "/interfaces/interface/state/admin-status": "UP",
      "/interfaces/interface/ethernet/state/mac-address": "...",
      "/interfaces/interface/ethernet/state/negotiated-port-speed": "SPEED_10GB",
      "_subcomponent_id": 0,
      "/interfaces/interface/state/enabled": "true",
      "/interfaces/interface/state/last-change": 1687025787040761000,
      "/interfaces/interface/hold-time/state/up": 0,
      "/interfaces/interface/hold-time/state/down": 0,
      "/interfaces/interface/ethernet/state/port-speed": "SPEED_10GB",
      "/interfaces/interface/ethernet/state/hw-mac-address": "...",
      "_sequence": 292,
      "/interfaces/interface/state/ifindex": 517,
      "/interfaces/interface/state/logical": "false",
      "/interfaces/interface/state/mtu": 9192,
      "/interfaces/interface/state/loopback-mode": "false"
    }
}
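
For anyone repeating this conversion, a minimal Python sketch follows. It assumes the line contains no escaped characters and no spaces or commas inside quoted strings, which holds for the metric above but not for line protocol in general. The function names `parse_field_value` and `line_to_metric` are made up for illustration and are not part of Telegraf, and the sample line reuses the `foobar` metric from this thread with a few invented fields.

```python
import json


def parse_field_value(raw):
    """Convert one line-protocol field value to a Python value (simplified)."""
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]              # quoted string
    if raw in ("true", "false"):      # other boolean spellings are not handled here
        return raw == "true"
    if raw.endswith("i"):
        return int(raw[:-1])          # integer, e.g. 517i
    return float(raw)                 # anything else is treated as a float


def line_to_metric(line):
    """Split 'name,tags fields timestamp' into a JSON-friendly dict.

    Assumption: exactly two unescaped spaces separate the three sections.
    """
    head, field_part, timestamp = line.strip().split(" ")
    name, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in field_part.split(","):
        key, raw = pair.split("=", 1)
        fields[key] = parse_field_value(raw)
    return {"name": name, "timestamp": int(timestamp), "tags": tags, "fields": fields}


# Hypothetical sample line; only 'value=42' and the tags come from the thread.
sample = 'foobar,host=ryzen,source=localhost value=42,ifindex=517i,status="UP",logical=false 1687786520000000000'
print(json.dumps(line_to_metric(sample), indent=4))
```

Unlike the hand-converted JSON above, this sketch keeps booleans as JSON booleans rather than strings, and Python integers preserve the full nanosecond timestamp.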

Then I used the following config to read the metric and send it using the file and opentelemetry outputs:

[agent]
    debug = true

[[inputs.file]]
    files = ["data.json"]
    data_format = "xpath_json"

    [[inputs.file.xpath]]
        metric_name = "/name"
        timestamp = "/timestamp"
        timestamp_format = "unix"
        field_selection = "fields/*"
        tag_selection = "tags/*"

[[outputs.file]]

[[outputs.opentelemetry]]
    service_address = "0.0.0.0:30003"

However, I get many of the following messages:

2023-06-26T13:47:26Z D! [outputs.opentelemetry] field has unsupported type measurement="intefaces" field="/interfaces/interface/ethernet/state/enable-flow-control" type="string"
2023-06-26T13:47:26Z D! [outputs.opentelemetry] field has unsupported type measurement="intefaces" field="/interfaces/interface/state/admin-status" type="string"

@jacobmarble - this would seem to explain why the receiver only reported the numeric metric and all the string fields were ignored. Does that seem like the right takeaway?

I did also try metrics with a few numeric fields and they worked as expected:

foobar,host=ryzen,source=localhost value=42,other_value=44 1687786520000000000

Was sent and correctly reported as:

foobar_value,host=ryzen,source=localhost gauge=42 1687786510000000000
foobar_other_value,host=ryzen,source=localhost gauge=44 1687786510000000000

powersj avatar Jun 26 '23 13:06 powersj
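
The behavior seen above (one `gauge` metric per numeric field, named `<measurement>_<field>`, with string fields skipped) can be mimicked in a few lines of Python. This is only an illustration of the observed output, not the plugin's actual code. `split_numeric_fields` is a hypothetical helper, the `status` field in the example is invented, and treating booleans as unsupported is an assumption that the thread does not confirm.

```python
def split_numeric_fields(measurement, tags, fields, timestamp):
    """Mimic the observed round trip: one gauge metric per numeric field.

    Returns (gauges, dropped), where 'dropped' lists the field names that
    were skipped because their values are not numeric.
    """
    gauges = []
    dropped = []
    for key, value in fields.items():
        # bool is a subclass of int in Python, so it is excluded explicitly.
        # Assumption: non-numeric types are all treated like the strings in the thread.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            dropped.append(key)
            continue
        gauges.append(
            {
                "name": f"{measurement}_{key}",
                "tags": dict(tags),
                "fields": {"gauge": value},
                "timestamp": timestamp,
            }
        )
    return gauges, dropped


# The 'foobar' metric from the thread, plus an invented string field.
gauges, dropped = split_numeric_fields(
    "foobar",
    {"host": "ryzen", "source": "localhost"},
    {"value": 42, "other_value": 44, "status": "UP"},
    1687786520000000000,
)
for metric in gauges:
    print(metric["name"], metric["fields"])
print("dropped:", dropped)
```

Applied to the original `intefaces` metric, the same logic would yield names such as `intefaces_/interfaces/interface/state/ifindex`, which matches the line seen in the telegraf2 log.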

Hi @powersj I am struggling with getting strings from snmp_trap into opentelemetry as well. Did you ever find a solution?

NielsMikuta avatar Dec 19 '23 05:12 NielsMikuta