Help request: how to create a ClusterParser that correctly maps the fields of nested JSON logs to Elasticsearch
Description
The pods on Kubernetes write JSON-formatted logs to the console. Which parser and filter configuration maps the JSON fields to Elasticsearch fields correctly?
fluent-operator is deployed in Kubernetes
version:
kubernetes:1.20.6
fluent-operator: v1.0.0
fluentbit:v1.8.11
runtime:docker 20.10.15
elasticsearch:7.16.2
I have the following nested JSON logs:
{"@timestamp":"2022-06-22T12:10:37.6374580+08:00","level":"Warning","messageTemplate":"No XML encryptor configured. Key {KeyId:B} may be persisted to storage in unencrypted form.","message":"No XML encryptor configured. Key {48f944fe-a365-43b8-a6d5-148057337a34} may be persisted to storage in unencrypted form.","fields":{"KeyId":"48f944fe-a365-43b8-a6d5-148057337a34","EventId":{"Id":35,"Name":"NoXMLEncryptorConfiguredKeyMayBePersistedToStorageInUnencryptedForm"},"SourceContext":"Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager"},"renderings":{"KeyId":[{"Format":"B","Rendering":"{48f944fe-a365-43b8-a6d5-148057337a34}"}]}}
{"@timestamp":"2022-06-22T12:01:18.4709632+08:00","level":"Information","messageTemplate":"请求抖音接口耗时:140,单位:毫秒,响应参数:{\"log_id\":\"2022622201010130545821338B4F\",\"data\":null,\"err_no\":31008,\"message\":\"生成token失败, token已过期\"}","message":"test:140,单位:毫秒,响应参数:{\"log_id\":\"2022062212018103541581338B4F\",\"data\":null,\"err_no\":31008,\"message\":\"生成token失败, token已过期\"}","fields":{"SourceContext":"Express.Providers.DouYinPlatformProvider"}}
Below is my configuration ClusterInput.yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  annotations:
    meta.helm.sh/release-name: fluent-operator
    meta.helm.sh/release-namespace: logging-system
  labels:
    managed-by: Helm
    fluentbit.fluent.io/component: hub-yhwms-eventtracing
    fluentbit.fluent.io/enabled: "true"
  name: test
spec:
  tail:
    db: /fluent-bit/tail/test.db
    dbSync: Normal
    memBufLimit: 5MB
    parser: test-parser
    path: /var/log/containers/*hub_hub-yhwms-express-api*
    refreshIntervalSeconds: 10
    skipLongLines: true
    tag: test.*
clusterfilters.yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  annotations:
    meta.helm.sh/release-name: fluent-operator
    meta.helm.sh/release-namespace: logging-system
  labels:
    app.kubernetes.io/managed-by: Helm
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: test
spec:
  filters:
  - kubernetes:
      annotations: false
      kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubeURL: https://kubernetes.default.svc:443
      labels: false
  - nest:
      addPrefix: kubernetes_
      nestedUnder: kubernetes
      operation: lift
  - modify:
      rules:
      - remove: stream
      - remove: kubernetes_pod_id
      - remove: kubernetes_host
      - remove: kubernetes_container_hash
      - remove: time
  - nest:
      nestUnder: kubernetes
      operation: nest
      removePrefix: kubernetes_
      wildcard:
      - kubernetes_*
  match: test.*
ClusterOutput.yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  annotations:
    meta.helm.sh/release-name: fluent-operator
    meta.helm.sh/release-namespace: logging-system
  labels:
    fluentbit.fluent.io/component: hub-yhwms-eventtracing
    fluentbit.fluent.io/enabled: "true"
  name: test
spec:
  es:
    generateID: true
    host: 192.168.100.220
    httpPassword:
      valueFrom:
        secretKeyRef:
          key: password
          name: test-test
    httpUser:
      valueFrom:
        secretKeyRef:
          key: username
          name: test-test
    logstashFormat: true
    logstashPrefix: test-json
    port: 31082
    timeKey: '@timestamp'
  matchRegex: (?:test|service)\.(.*)
clusterparser.yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterParser
metadata:
  name: test-parser
  labels:
    fluentbit.fluent.io/enabled: "true"
spec:
  decoders:
  - decodeField: escaped_utf8 message do_next
  - decodeFieldAs: escaped docker message
  json:
    timeFormat: "%Y-%m-%dT%H:%M:%S.%L"
    timeKeep: true
    timeKey: "@timestamp"
How did you install fluent operator?
helm install fluent-operator fluent-operator -n logging-system
Additional context
With the current configuration, the whole log line ends up in a single Elasticsearch field instead of being split into separate fields.

@hyt05 this issue was automatically closed because it did not follow the issue template
You can refer to this: https://github.com/fluent/fluent-operator#custom-parser
Regarding "You can refer to this: https://github.com/fluent/fluent-operator#custom-parser": the documentation is very terse and did not really help.
This is the example https://github.com/fluent/fluent-operator/tree/master/manifests/regex-parser
thank you very much
Closing this for now. You can reopen it whenever you want.
@huangyutongs did you manage to get this working? I have the same problem.
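Just set clusterfilters.fluentbit.fluent.io.spec.filters.kubernetes.mergeLog to true. With that option the kubernetes filter checks whether the log field contains a JSON map and, if so, adds its keys to the record as separate fields, so the JSON is parsed correctly.
You may also be interested in the option clusterfilters.fluentbit.fluent.io.spec.filters.kubernetes.mergeLogTrim, which trims trailing newline characters from the merged field values.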
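To make the effect of mergeLog and keepLog more concrete, here is a rough Python model of the merge step. It is only a sketch of the behaviour described above, not Fluent Bit's actual implementation; the function name merge_log and the shortened sample record are made up for illustration.

```python
import json


def merge_log(record: dict, keep_log: bool = True) -> dict:
    """Simplified model of mergeLog: if record["log"] holds a JSON map,
    add its keys to the record. Hypothetical helper, not a Fluent Bit API."""
    raw = record.get("log")
    if not isinstance(raw, str):
        return dict(record)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return dict(record)  # not JSON: leave the record unchanged
    if not isinstance(parsed, dict):
        return dict(record)  # JSON, but not a map: nothing to merge
    merged = dict(record)
    if not keep_log:
        merged.pop("log")  # models keepLog: false
    merged.update(parsed)
    return merged


# Shortened, made-up version of a Docker-style record from the question.
docker_record = {
    "log": '{"level":"Warning","message":"No XML encryptor configured.",'
           '"fields":{"EventId":{"Id":35}}}\n',
    "stream": "stdout",
}

result = merge_log(docker_record, keep_log=False)
print(sorted(result))                     # ['fields', 'level', 'message', 'stream']
print(result["fields"]["EventId"]["Id"])  # 35
```

Without the merge step the record keeps the whole JSON string under log, which matches the symptom reported in the question.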
This is the configuration manifest I use in production:
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  annotations:
    meta.helm.sh/release-name: oak-adapter-api
    meta.helm.sh/release-namespace: oak
  labels:
    app.kubernetes.io/managed-by: Helm
    fluentbit.fluent.io/component: oak-adapter-api
    fluentbit.fluent.io/enabled: "true"
  name: oak-adapter-api
spec:
  filters:
  - lua:
      call: containerd
      script:
        key: containerd.lua
        name: fluent-bit-containerd-config
      timeAsTable: true
  - kubernetes:
      annotations: false
      keepLog: false
      kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubeURL: https://kubernetes.default.svc:443
      labels: false
      mergeLog: true
      mergeLogTrim: true
  - nest:
      addPrefix: kubernetes_
      nestedUnder: kubernetes
      operation: lift
  - modify:
      rules:
      - remove: stream
      - remove: kubernetes_pod_id
      - remove: kubernetes_host
      - remove: kubernetes_container_hash
      - remove: time
      - remove: logtag
      - remove: fields.RequestPath.keyword
  - nest:
      nestUnder: kubernetes
      operation: nest
      removePrefix: kubernetes_
      wildcard:
      - kubernetes_*
  match: oak-adapter-api*
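
For readers unfamiliar with the lift / modify / nest chain in the manifest above, the following Python sketch models what those three steps do to a record. It is an approximation under stated assumptions: the helper names and the sample record are hypothetical, and the kubernetes_* wildcard is modelled as a plain prefix match.

```python
def lift(record: dict, nested_under: str, add_prefix: str) -> dict:
    """Model of nest/operation: lift. Moves the keys of a nested map
    to the top level, adding a prefix to each key."""
    out = {k: v for k, v in record.items() if k != nested_under}
    for key, value in record.get(nested_under, {}).items():
        out[add_prefix + key] = value
    return out


def remove_keys(record: dict, keys: set) -> dict:
    """Model of the modify filter's remove rules."""
    return {k: v for k, v in record.items() if k not in keys}


def nest(record: dict, prefix: str, nest_under: str) -> dict:
    """Model of nest/operation: nest with removePrefix. Collects keys
    starting with the prefix back under one map."""
    out, nested = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            nested[key[len(prefix):]] = value
        else:
            out[key] = value
    if nested:
        out[nest_under] = nested
    return out


# Made-up record resembling the output of the kubernetes filter.
record = {
    "message": "hello",
    "stream": "stdout",
    "kubernetes": {"pod_name": "api-0", "pod_id": "abc", "host": "node-1"},
}

step1 = lift(record, "kubernetes", "kubernetes_")
step2 = remove_keys(step1, {"stream", "kubernetes_pod_id", "kubernetes_host"})
step3 = nest(step2, "kubernetes_", "kubernetes")
print(step3)
# {'message': 'hello', 'kubernetes': {'pod_name': 'api-0'}}
```

Lifting first is what lets the modify rules address individual metadata keys by a flat name before they are grouped under kubernetes again.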