Implement default fallback option when using templates in autodiscover
When defining templates in autodiscover, it would be nice to have a default fallback to use when none of them matches, something like this:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: "nginx"
          config:
            - module: nginx
              access:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
    # Fallback to docker prospector (without parsing) for unknown containers:
    default:
      - type: docker
        container.ids:
          - "${data.docker.container.id}"
👍 on this suggestion. Another option for the config would be to have it as follows:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - default: true
          config:
            container.ids:
              - "${data.docker.container.id}"
        - condition:
            contains:
              docker.container.name: "nginx"
          config:
            - module: nginx
              access:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
- I think there should be 1 default per type defined. This allows different defaults for each type, and if type: docker is defined multiple times, it would allow one default for each. I think this could allow interesting combinations. I wonder if the indentation in your example above should have been two spaces to the left, to be on the same level as providers?
- The option above would allow to have 1 format for all config options and not have it in 2 places. I'm hoping that simplifies the code.
- In the above case, there is no "global" fallback if someone uses multiple providers. Do we need that?
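To make the per-provider idea concrete, here is a rough sketch with two providers, each carrying its own fallback. The `default: true` key is only the syntax proposed in this thread and is not something Filebeat implements; the input settings are illustrative.

```yaml
# Proposed syntax only: `default: true` is NOT an implemented Filebeat option.
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Fallback for docker containers that match no other template
        - default: true
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
    - type: kubernetes
      templates:
        # Separate fallback that applies to this provider only
        - default: true
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
```

With this layout there is still no single fallback shared by both providers, which is the open question in the last bullet.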
Hi!
How is a condition.contains[] met today?
If it is met only when all the sub-conditions in it are true, i.e. true && met(contains[0]) && met(contains[1]) && ... met(contains[n-1]), I'd be happy with multiple levels of defaulting to meet my needs. For example, I'd want to write:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - # This specific type of containers/pods in this specific namespace has its own log format
          condition:
            contains:
              kubernetes.pod.name: "mixer"
              kubernetes.namespace.name: "istio-system"
          config:
            - module: istio-mixer
              log:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
        - # This is the default for the "istio-system" namespace.
          # "Almost" all the containers/pods in this specific namespace would have a uniform log format
          condition:
            contains:
              kubernetes.namespace: "istio-system"
          config:
            - module: istio
              log:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
        - # This is the default for our modern apps of `type: docker`. Assume it emits structured logs as the pod is annotated with its modernity.
          condition:
            anyof:
              kubernetes.pod.annotations:
                contains: "i-am-modern-ndjson-logging-app"
          config:
            - module: ndjson-logging-app
              log:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
        - # This is the global default per `type: docker`. We assume it emits non-structured logs
          config:
            - module: mylegacyapp
              log:
                prospector:
                  type: docker
                  containers.stream: stdout
                  container.ids:
                    - "${data.docker.container.id}"
@mumoshu Have been trying to find documentation for the conditionals. Do all of these work?
Hi @rossedman, the configuration seen in my above comment is just a suggestion! It isn't implemented as of today.
Does it look good to you? I had been willing to contribute a PR once I get some approval and/or support on the suggestion but was unable to do so due to.. the silence 😉
Sorry for the late response @mumoshu, I missed your question while going over email :innocent:
The answer is yes, contains matches all the fields in the given map :)
@rossedman you can find more info about conditions here: https://www.elastic.co/guide/en/beats/metricbeat/6.2/defining-processors.html#conditions
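As a small illustration of that answer, here is a sketch assuming the `contains` semantics described above, where every field listed in the map has to match. The field values are borrowed from the earlier suggestion, and the second form uses the documented `and` condition, which should behave the same way.

```yaml
# Both fields must match for this template to apply.
- condition:
    contains:
      kubernetes.pod.name: "mixer"
      kubernetes.namespace: "istio-system"

# The same intent spelled out with an explicit `and`:
- condition:
    and:
      - contains:
          kubernetes.pod.name: "mixer"
      - contains:
          kubernetes.namespace: "istio-system"
```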
Hi @exekias
I'm trying to implement OR logic using multiple condition statements in filebeat.yml,
but it doesn't work.
- condition:
    or:
      - contains:
          docker.container.name: "image1"
      - contains:
          docker.container.name: "image2"
Is there a way to achieve this without duplicating single condition?
- condition:
    contains:
      docker.container.name: "image1"
- condition:
    contains:
      docker.container.name: "image2"
@mvasilenko I tested this and it worked for me: https://gist.github.com/exekias/e802ef376fdbbd4ba5872b57af4128bf
You may want to review your config, "image1" is repeated
Good suggestion. Until something of this kind is implemented, as far as I understand, we have to resort to a contortion like this:
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            or:
              - contains:
                  kubernetes.pod.name: internal-api
              - contains:
                  kubernetes.pod.name: customer-api
              - contains:
                  kubernetes.pod.name: exporter
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              multiline.pattern: '^[[:space:]]+((at|\.{3})\b|^Caused by:)?'
              multiline.negate: false
              multiline.match: after
        - condition:
            and:
              - not:
                  contains:
                    kubernetes.pod.name: internal-api
              - not:
                  contains:
                    kubernetes.pod.name: customer-api
              - not:
                  contains:
                    kubernetes.pod.name: exporter
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
https://github.com/elastic/beats/pull/9029 was just merged, which brings the ability to define a configuration without conditions. Conditions are matched in order, so if you put this one at the end it will act as the default. As I think this solves the issue, I'm proceeding to close it.
This is an example config that will be possible with this change:
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            or:
              - contains:
                  kubernetes.pod.name: internal-api
              - contains:
                  kubernetes.pod.name: customer-api
              - contains:
                  kubernetes.pod.name: exporter
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              multiline.pattern: '^[[:space:]]+((at|\.{3})\b|^Caused by:)?'
              multiline.negate: false
              multiline.match: after
        # Default:
        - config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
I tried that configuration with docker containers only and it fails: docker log files are parsed multiple times.
Consider the following config:
filebeat.autodiscover:
  providers:
    # Provider for our docker containers
    - type: docker
      templates:
        # Template for the spring boot json logging containers
        - condition:
            contains:
              docker.container.image: myuser/myimage
          config:
            - type: docker
              containers:
                ids:
                  - ${data.docker.container.id}
              encoding: utf-8
              json:
                keys_under_root: true
                add_error_key: true
                message_key: "message"
                overwrite_keys: true
                match: after
              fields:
                log.format.content: "json"
                log.format.layout: "spring-boot"
        - condition:
          config:
            - type: docker
              containers:
                ids:
                  - ${data.docker.container.id}
              encoding: utf-8
              fields:
                log.format.content: "plain"
                log.format.layout: "spring"
When I check this configuration with filebeat 6.6.2, it tells me that the config is ok. When I start this configuration, my expected behavior is:
- The log from the container myuser/myimage is harvested using the template for json logging
- All other containers use the default template
What happens is:
- The log from the container myuser/myimage is harvested using the given template
- The logs of all containers, including myuser/myimage, are harvested using the default template.
Therefore the logs for the container using image myuser/myimage are harvested twice.
Is this the expected behavior? And if so, can the second log stream be suppressed in any way?
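One way to suppress the second stream, following the `not` workaround posted earlier in this thread, is to give the fallback template an explicit negated condition instead of an empty one. This is an untested sketch that assumes every matching template is applied, as observed above; the input settings are trimmed to the essentials.

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # JSON-logging containers
        - condition:
            contains:
              docker.container.image: myuser/myimage
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
              json.keys_under_root: true
        # Fallback: applies only when the condition above does NOT match
        - condition:
            not:
              contains:
                docker.container.image: myuser/myimage
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
```

The cost is that every condition added above has to be mirrored in the negated fallback, which is exactly the duplication this issue is trying to avoid.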
Same here. I don't think this is expected behavior.
I also tried to use a default condition with filebeat 6.6.2. My config is as follows:
autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            equals:
              kubernetes.labels.type: java
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} '
              multiline.negate: true
              multiline.match: after
              ignore_older: 48h
        - condition:
            contains:
              kubernetes.labels.app: nginx
          config:
            - module: nginx
              access:
                input:
                  type: docker
                  containers.stream: stdout
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  ignore_older: 48h
              error:
                input:
                  type: docker
                  containers.stream: stderr
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  ignore_older: 48h
        - config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              ignore_older: 48h
The result is that my logs are stored twice. For example, for java apps which have kubernetes.labels.type: java I have a multiline doc with parsed fields (loglevel, class, etc) AND a raw doc without these fields.
Simple example to reproduce: https://gist.github.com/olivierboudet/796a240577ea7f9fcb5a6f25a7114e6c In the configuration, I added a condition to ignore all logs from the filebeat container. If I uncomment the last part (i.e. the default config), all logs are processed, including those from the filebeat container.
Any news on this? Does the unconditional config work or not?
Sorry for the late response. We released default_config setting for hints based autodiscover, check how it works here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html#_kubernetes_2
Check hints.default_config setting. I would say it fulfills what we are seeking with this issue
@exekias Using this currently:
filebeat.autodiscover:
  providers:
    - type: kubernetes
      in_cluster: false
      host: ${HOSTNAME}
      kube_config: /home/svcaccount/.kube/config
      hints_enabled: true
      templates:
        - condition:
            contains:
              kubernetes.container.image: redis
          config:
            - module: redis
              log:
                enabled: true
                input:
                  type: container
                  paths:
                    - /var/lib/docker/containers/${data.kubernetes.container.id}/*.log
              slowlog:
                enabled: false
        - config:
            - type: container
              paths: ["/var/lib/docker/containers/${data.kubernetes.container.id}/*.log"]
              exclude_lines: ["^\\s+[\\-`('.|_]"] # drop asciiart lines
Which seems to work without using hints.default_config. I did not immediately find any duplicates. As I'm actually not really using hints, should I still put:
- config:
    - type: container
      paths: ["/var/lib/docker/containers/${data.kubernetes.container.id}/*.log"]
      exclude_lines: ["^\\s+[\\-`('.|_]"] # drop asciiart lines
under hints.default_config: ?
That's interesting. Some users reported these settings were failing for them. Do you have any redis container running? We would need to check that it falls under the redis condition and the default settings are not launched for it
Sorry for the late response. We released default_config setting for hints based autodiscover, check how it works here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html#_kubernetes_2
Check hints.default_config setting. I would say it fulfills what we are seeking with this issue
Would you please elaborate? I couldn't get it to work :(
Using Filebeat v7.4 I'm getting duplicates with the config below. There is a message containing the decoded JSON fields and another that apparently used the default config and did not decode the JSON. It seems like the default config acts as a "finally" step and is applied whether or not any condition was true for a given container.
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            equals:
              container.image.name: "mycomponent"
          config:
            - type: container
              paths:
                - /usr/share/filebeat/dockerlogs/${data.docker.container.id}/*.log
              processors:
                - decode_json_fields:
                    fields: ["message"]
                    target: "json"
        - config:
            - type: container
              paths:
                - /usr/share/filebeat/dockerlogs/${data.docker.container.id}/*.log
Is there anyone still working on this? I'm running into the same issue.
edit: running 7.5.2
Ditto. I'm not familiar with the Beats code, but I was looking at the pull request that made conditions in autodiscover optional, https://github.com/elastic/beats/pull/9029/files, and the template.GetConfig method seems to be applying the null condition config regardless of whether another condition is met. I thought the null condition config should only be applied if no other condition were met.
Still exists with filebeat 7.6.1
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am trying to do some workaround with appenders
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config.enabled: false
      templates:
        - condition.contains:
            kubernetes.labels.sbkslog: "true"
          config:
            - type: container
              paths:
                - /var/log/containers/*${data.kubernetes.container.id}.log
              add_kubernetes_metadata:
                host: ${NODE_NAME}
                matchers:
                  - logs_path:
                      logs_path: "/var/log/containers/"
              processors:
                - add_fields:
                    fields:
                      fb_conf: "sbkslog"
      appenders:
        - type: config
          config:
            - type: container
              processors:
                - add_fields:
                    fields:
                      fb_conf: "default"
              paths:
                - /var/log/containers/*${data.kubernetes.container.id}.log
              add_kubernetes_metadata:
                host: ${NODE_NAME}
                matchers:
                  - logs_path:
                      logs_path: "/var/log/containers/"
Using hints.default_config is not an option because I need hints.default_config.enabled: false to avoid collecting logs from all containers. I want to collect logs only from containers whose hints annotation is set to enabled, and then apply some conditions with a DEFAULT option.
When I remove that appenders section I still get logs from containers that do not match the first condition in the templates section, but where is the configuration for them defined? How does filebeat know where to collect logs?
Are there any updates on that? Does not work on 7.6.2
How does filebeat know where to collect logs?
We've been trying to figure this out as well. We're seeing filebeat output this on debug, for our auto discovered non-default_config'd nginx annotated pod:
{
  "access": {
    "enabled": true,
    "input": {
      "paths": [
        "/var/lib/docker/containers/769cc9f43a9f7eec58a23762a746c92130c8dee7cad4d353dfd7e89e1ac2711f/*-json.log"
      ],
      "stream": "stdout",
      "type": "container"
    }
  },
  ...
  "module": "nginx"
}
but that will not work because it's a containerd based kubernetes cluster and not docker. We can't seem to figure out how to tell filebeat to instead use /var/log/containers/*${data.kubernetes.container.id}.log for those auto discovered and discretely enabled pods.
maybe @exekias can point us in the right direction?
To answer my own question... I decided to dig into the code and found this https://github.com/elastic/beats/blob/a4e5a73af1df3020b5d50d5a198963bd66e5c370/filebeat/autodiscover/builder/hints/logs.go, which led me to trying this
autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints:
        enabled: true
        default_config:
          enabled: false
          type: container
          paths:
            - /var/log/containers/*-${data.kubernetes.container.id}.log
which works. The default_config is applied to the events found through autodiscover hints.
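With `default_config.enabled: false`, a pod then has to opt in through a hint before its logs are collected. A minimal sketch of such a pod, assuming the standard `co.elastic.logs/enabled` hint annotation; the pod name and image are placeholders, not taken from this thread.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                        # hypothetical name
  annotations:
    co.elastic.logs/enabled: "true"   # opt this pod in to log collection
spec:
  containers:
    - name: my-app
      image: nginx                    # placeholder image
```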
The problem now is that it looks like you can only provide exactly one default config if you are using hint-based autodiscover (at least I couldn't figure out how to do otherwise). You cannot say: "For these logs I want to use this default config if no hints are provided, and for these logs I want to use another default config."
Hi! We just realized that we haven't looked into this issue in a while. We're sorry!
We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!
Hi, I have been facing a similar issue. Here is my post: https://discuss.elastic.co/t/filebeat-not-logging-hints-enabled-container-logs/310574
Basically, I'm trying to get logs from the nginx ingress controller, which is working fine, but I can't get hints to work. I would really appreciate it if someone who got it working could post the solution either to my post or to this thread.
Below is my filebeat.yaml file content:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: filebeat
  namespace: search
spec:
  type: filebeat
  version: 7.12.1
  elasticsearchRef:
    name: elastic-search
  kibanaRef:
    name: kibana-web
  config:
    filebeat.autodiscover.providers:
      - type: kubernetes
        node: ${NODE_NAME}
        hints:
          enabled: true
          #add_resource_metadata.namespace.enabled: true
          default_config:
            enabled: false
            type: container
            paths:
              - /var/log/containers/*${data.kubernetes.container.id}.log
        templates:
          - condition:
              contains:
                kubernetes.pod.name: ingress
            config:
              - paths: ["/var/log/containers/*${data.kubernetes.container.id}.log"]
                type: container
    processors:
      - add_cloud_metadata: {}
      - add_host_metadata: {}
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: filebeat
        automountServiceAccountToken: true
        terminationGracePeriodSeconds: 30
        dnsPolicy: ClusterFirstWithHostNet
        #hostNetwork: true # Allows to provide richer host metadata
        containers:
          - name: filebeat
            securityContext:
              runAsUser: 0
              # If using Red Hat OpenShift uncomment this:
              #privileged: true
            volumeMounts:
              - name: varlogcontainers
                mountPath: /var/log/containers
              - name: varlogpods
                mountPath: /var/log/pods
              - name: varlibdockercontainers
                mountPath: /var/lib/docker/containers
            env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
            resources:
              requests:
                memory: 200Mi
                cpu: 0.2
              limits:
                memory: 300Mi
                cpu: 0.4
        volumes:
          - name: varlogcontainers
            hostPath:
              path: /var/log/containers
          - name: varlogpods
            hostPath:
              path: /var/log/pods
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers