
[inputs.docker] error message about oomkilled

Open optica-phoffmann opened this issue 4 months ago • 0 comments

Relevant telegraf.conf

[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "60s"
  flush_jitter = "20s"
  precision = "0s"
  hostname = ""
  omit_hostname = false

[[inputs.docker]]
  endpoint = "tcp://<localhost>:2376"
  gather_services = true
  container_names = []
  source_tag = true
  container_name_include = []
  container_name_exclude = []
  storage_objects = []
  timeout = "5s"
  perdevice = false
  total = true
  total_include = ["cpu", "blkio", "network"]
  docker_label_include = []
  docker_label_exclude = []



[[outputs.opensearch]]
   namedrop = [ "prometheus_remote_write" ]
   fieldexclude = [ "*inodes*",]
   urls = [ "http://opensearch-server:9200" ]
   index_name = 'telegraf_server_local_{{.Time.Format "2006-01" }}'
   timeout = "5s"
   enable_sniffer = false
   health_check_interval = "10s"
   manage_template = true
   template_name = "telegraf"
   overwrite_template = false

Logs from Telegraf

2024-10-28T10:55:28.297771+01:00 server telegraf[2619349]: 2024-10-28T09:55:28Z E! [outputs.opensearch] error while OpenSearch bulkIndexing: mapper_parsing_exception: unknown parameter [norms] on mapper [oomkilled] of type [null]

System info

Alma9.3, telegraf 1.29.1, Opensearch 2.11.1, Docker 23.0.6

Docker

No response

Steps to reproduce

  1. Start Telegraf with the config above on a Docker Swarm server
  2. Look into the logs and find the error messages

Expected behavior

I don't expect to see OpenSearch error messages when using the Telegraf plugin.

Actual behavior

I see the error messages shown in the logs above.

Additional info

I googled this error for quite some time but couldn't find any useful information. I do know how to work around the problem, by setting:

[[outputs.opensearch]]
   fieldexclude = [ "oomkilled"]

I did a packet trace and identified the following message:

"docker_container_status":
    {"container_id":"5d99f75d3248bcb411a230093358968ba15ae410260a78782addfa51570ef96f",
    "exitcode":0,
    "oomkilled":false,
    "pid":930631,
    "restart_count":0,
    "started_at":1728915137827671480,
    "uptime_ns":1193454255557255},
    "measurement_name":"docker_container_status",
    "tag":{
        "com.docker.swarm.node.id":"m8dspac453l7dhhvmjuc9tiu9",
        "com.docker.swarm.service.id":"5ogdh46b9wc3vg77jdlglwmbc",
        "com.docker.swarm.service.name":"docker-service",
        "com.docker.swarm.task":"",
        "com.docker.swarm.task.id":"qrh0kclq4fp1gcxvuj065v027",
        "com.docker.swarm.task.name":"docker-service.1.qrh0kclq4fp1gcxvuj065v027",
        "container_image":"dockerhub/docker-service:1.0.0@sha256",
        "container_name":"docker-service.1.qrh0kclq4fp1gcxvuj065v027",
        "container_status":"running",
        "container_version":"3397a5c487d758eba9206766d6878660c15f70775d16672326eedead0dfe9cba",
        "engine_host":"<server-name>",
        "host":"<server-name>",
        "proxy":"true",
        "server_version":"23.0.6"
    }
}

However, oomkilled is not null in this message: it is sent as the boolean false. I found out that norms is a setting in the OpenSearch index template. As the config above shows (manage_template = true), that index template is managed by Telegraf.
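
To narrow down where norms comes from, the managed template could be exported as JSON and searched for that parameter. The following is only a minimal sketch under stated assumptions: find_param is a hypothetical helper, and SAMPLE_TEMPLATE is an invented stand-in, not the template Telegraf actually installs and not a dump from this cluster. In practice the JSON would come from the cluster's own template API.

```python
from typing import Any, Iterator


def find_param(node: Any, param: str, path: str = "") -> Iterator[str]:
    """Yield the dotted path of every dict inside `node` that sets `param`."""
    if isinstance(node, dict):
        if param in node:
            yield path or "<root>"
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            yield from find_param(value, param, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_param(value, param, f"{path}[{index}]")


# Hypothetical template body for illustration only (an assumption,
# NOT the real Telegraf-managed template).
SAMPLE_TEMPLATE = {
    "mappings": {
        "dynamic_templates": [
            {"tags": {"path_match": "tag.*", "mapping": {"type": "keyword"}}},
            {"catch_all": {"match": "*", "mapping": {"norms": False}}},
        ]
    }
}

if __name__ == "__main__":
    # Print every mapping in the template that sets `norms`.
    for hit in find_param(SAMPLE_TEMPLATE, "norms"):
        print(hit)
```

If a mapping that sets norms also matches boolean fields such as oomkilled, that would fit the error above, but this is an assumption that needs to be checked against the real template.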

optica-phoffmann, Oct 28 '24 12:10