[inputs.docker] error message about tag.com.docker.swarm.task

Open optica-phoffmann opened this issue 4 months ago • 0 comments

Relevant telegraf.conf


[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "60s"
  flush_jitter = "20s"
  precision = "0s"
  hostname = ""
  omit_hostname = false

[[inputs.docker]]
  endpoint = "tcp://<localhost>:2376"
  gather_services = true
  container_names = []
  source_tag = true
  container_name_include = []
  container_name_exclude = []
  storage_objects = []
  timeout = "5s"
  perdevice = false
  total = true
  total_include = ["cpu", "blkio", "network"]
  docker_label_include = []
  docker_label_exclude = []



[[outputs.opensearch]]
   namedrop = [ "prometheus_remote_write" ]
   fieldexclude = [ "*inodes*",]
   urls = [ "http://opensearch-server:9200" ]
   index_name = 'telegraf_server_local_{{.Time.Format "2006-01" }}'
   timeout = "5s"
   enable_sniffer = false
   health_check_interval = "10s"
   manage_template = true
   template_name = "telegraf"
   overwrite_template = false

Logs from Telegraf

2024-10-28T10:55:28.221477+01:00 server telegraf[2619349]: 2024-10-28T09:55:28Z E! [outputs.opensearch] error while OpenSearch bulkIndexing: illegal_argument_exception: can't merge a non object mapping [tag.com.docker.swarm.task] with an object mapping

System info

Alma9.3, telegraf 1.29.1, Opensearch 2.11.1, Docker 23.0.6

Docker

No response

Steps to reproduce

  1. Start Telegraf with the config above on a Docker Swarm server.
  2. Check the Telegraf logs for error messages.

Error Message:

2024-10-28T10:55:28.221409+01:00 server telegraf[2619349]: 2024-10-28T09:55:28Z E! [outputs.opensearch] error while OpenSearch bulkIndexing: illegal_argument_exception: can't merge a non object mapping [tag.com.docker.swarm.task] with an object mapping

Expected behavior

I don't expect to see OpenSearch error messages when using the Telegraf plugin.

Actual behavior

Telegraf logs the OpenSearch bulk-indexing error shown above.

Additional info

I searched for this error for quite some time but couldn't find any useful information. I do know how to work around the problem: I set docker_label_exclude = ["com.docker.swarm.task"].
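To illustrate what that workaround removes, here is a minimal Python sketch. It assumes docker_label_exclude matches label names as glob patterns and approximates that with Python's fnmatchcase; the helper apply_label_exclude is hypothetical and is not Telegraf code. The label names are the swarm task labels seen on the affected container.

```python
from fnmatch import fnmatchcase


def apply_label_exclude(labels, exclude_patterns):
    """Drop every label whose name matches one of the glob patterns.

    Hypothetical helper: it only approximates the assumed glob
    semantics of docker_label_exclude, it is not Telegraf's code.
    """
    return {
        name: value
        for name, value in labels.items()
        if not any(fnmatchcase(name, pattern) for pattern in exclude_patterns)
    }


# Swarm task labels of the affected container (values shortened).
labels = {
    "com.docker.swarm.task": "",
    "com.docker.swarm.task.id": "qrh0kclq4fp1gcxvuj065v027",
    "com.docker.swarm.task.name": "docker-service.1.qrh0kclq4fp1gcxvuj065v027",
}

remaining = apply_label_exclude(labels, ["com.docker.swarm.task"])
print(sorted(remaining))
```

Under that assumption, a pattern without wildcards removes only the bare "com.docker.swarm.task" label and keeps "com.docker.swarm.task.id" and "com.docker.swarm.task.name", so the more useful tags survive.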

I did a packet trace and identified the following message:

{
    "docker_container_status": {
        "container_id": "5d99f75d3248bcb411a230093358968ba15ae410260a78782addfa51570ef96f",
        "exitcode": 0,
        "oomkilled": false,
        "pid": 930631,
        "restart_count": 0,
        "started_at": 1728915137827671480,
        "uptime_ns": 1193454255557255
    },
    "measurement_name": "docker_container_status",
    "tag": {
        "com.docker.swarm.node.id": "m8dspac453l7dhhvmjuc9tiu9",
        "com.docker.swarm.service.id": "5ogdh46b9wc3vg77jdlglwmbc",
        "com.docker.swarm.service.name": "docker-service",
        "com.docker.swarm.task": "",
        "com.docker.swarm.task.id": "qrh0kclq4fp1gcxvuj065v027",
        "com.docker.swarm.task.name": "docker-service.1.qrh0kclq4fp1gcxvuj065v027",
        "container_image": "dockerhub/docker-service:1.0.0@sha256",
        "container_name": "docker-service.1.qrh0kclq4fp1gcxvuj065v027",
        "container_status": "running",
        "container_version": "3397a5c487d758eba9206766d6878660c15f70775d16672326eedead0dfe9cba",
        "engine_host": "<server-name>",
        "host": "<server-name>",
        "proxy": "true",
        "server_version": "23.0.6"
    }
}

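As I understand it, the error arises because "com.docker.swarm.task" is first assigned a plain value ("com.docker.swarm.task": "") and is then also used as the parent of sub-elements such as "com.docker.swarm.task.id":"<omitted>". The same field would have to be mapped both as a plain value and as an object, which matches the "can't merge a non object mapping ... with an object mapping" error. I don't know whether this should be fixed on the plugin side or on the OpenSearch side.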
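The suspected conflict can be checked mechanically. The following Python sketch assumes that OpenSearch treats each dot in a field name as an object path separator, which is consistent with the error text but is not verified against the OpenSearch source here. The function find_mapping_conflicts is a hypothetical helper, and the tag set is an excerpt from the packet trace above.

```python
def find_mapping_conflicts(tags):
    """Return (leaf_key, nested_key) pairs that cannot share one mapping.

    Assumption: dots in a key act as object path separators, so a key
    that holds a plain value conflicts with any other key that extends
    it with ".something".
    """
    keys = sorted(tags)
    conflicts = []
    for key in keys:
        prefix = key + "."
        for other in keys:
            if other.startswith(prefix):
                conflicts.append((key, other))
    return conflicts


# Excerpt of the "tag" object from the packet trace.
tags = {
    "com.docker.swarm.node.id": "m8dspac453l7dhhvmjuc9tiu9",
    "com.docker.swarm.service.id": "5ogdh46b9wc3vg77jdlglwmbc",
    "com.docker.swarm.service.name": "docker-service",
    "com.docker.swarm.task": "",
    "com.docker.swarm.task.id": "qrh0kclq4fp1gcxvuj065v027",
    "com.docker.swarm.task.name": "docker-service.1.qrh0kclq4fp1gcxvuj065v027",
    "container_status": "running",
}

for leaf, nested in find_mapping_conflicts(tags):
    print(f"{leaf!r} is a plain value but {nested!r} needs it to be an object")
```

For this tag set the check reports only "com.docker.swarm.task", conflicting with its ".id" and ".name" sub-keys. The service and node labels pass because no bare "com.docker.swarm.service" or "com.docker.swarm.node" key exists.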

optica-phoffmann, Oct 28 '24 12:10