splunk-connect-for-kubernetes

splunk-kubernetes-logging multiline regex pattern for different pods in namespace

Open naveenb30 opened this issue 2 years ago • 4 comments

What happened: We use Splunk Connect to ingest all application logs from K8s into Splunk for further analysis. Different types of logs arrive in our index from different pods in the same namespace, and the default line breaker appears to be the newline character, which splits each multiline log into separate events. We are trying to figure out whether we can configure multiple regex patterns for different pods in the same namespace.

Current configuration, which works but lets us configure only a single firstline regex:

global:
  splunk:
    hec:
      protocol: https
      insecureSSL: false
      host: http-inputs-*****.splunkcloud.com
      port: 443
      token: **********
      indexName: app_logs

splunk-kubernetes-logging:
  journalLogPath: /run/log/journal
  splunk:
    hec:
      indexName: app_logs
  logs:
    argo: #namespace
      from:
        pod: "log-*"
      multiline:
        firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:kube_test

splunk-kubernetes-metrics:
  enabled: false

splunk-kubernetes-objects:
  enabled: false
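
To make the role of the firstline regex concrete, here is a minimal Python sketch of the grouping idea: a line that matches the pattern starts a new event, and every other line is appended to the current one. This only approximates what the logging pipeline does; the function name `group_events` and the sample log lines are hypothetical, not taken from the chart.

```python
import re

# Same pattern as the `firstline` value in the config above.
FIRSTLINE = re.compile(r"timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}")

def group_events(lines, firstline=FIRSTLINE):
    """Merge raw lines into events; a line matching `firstline` starts a new event."""
    events, current = [], []
    for line in lines:
        # A matching line closes the event collected so far.
        if firstline.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

# Hypothetical log lines: one stack trace followed by a single-line message.
sample = [
    "timestamp=2022-06-24 15:06:01.123 level=ERROR msg=failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Main.run(Main.java:42)",
    "timestamp=2022-06-24 15:06:02.456 level=INFO msg=recovered",
]
events = group_events(sample)
print(len(events))
```

Without the regex, each of the four lines would become its own event; with it, the stack trace stays attached to the line that produced it.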

What you expected to happen: A way to configure multiple firstline regex patterns for different pods in the same namespace.

global:
  splunk:
    hec:
      protocol: https
      insecureSSL: false
      host: http-inputs-*****.splunkcloud.com
      port: 443
      token: **********
      indexName: app_logs

splunk-kubernetes-logging:
  journalLogPath: /run/log/journal
  splunk:
    hec:
      indexName: app_logs
  logs:
# Not working, not sure how to configure multiple regex pattern for different pods under one namespace
   - argo:
      from:
        pod: "log-*"
      multiline:
        firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:kube_test
#    - argo:
#         from:
#           pod: "spot-*"
#         multiline:
#           firstline: /\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
#         sourcetype: kube:applogs2
splunk-kubernetes-metrics:
  enabled: false

splunk-kubernetes-objects:
  enabled: false

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.21.12-eks-a64ea69
  • OS (e.g: cat /etc/os-release):
  • Splunk version: Splunk Cloud Version: 8.2.2202.1
  • Splunk Connect for Kubernetes helm chart version: 1.4.15
  • Others:

naveenb30, Jun 24 '22 15:06

Hi @naveenb30, you can follow the instructions here: https://github.com/splunk/splunk-connect-for-kubernetes/blob/29ec02a96ac951a9a012f85dbd2ad53e8c8ba2b7/helm-chart/splunk-connect-for-kubernetes/charts/splunk-kubernetes-logging/values.yaml#L160-L236

I am not sure I understood your issue correctly, but I can see some issues with your config:

logs:
# Not working, not sure how to configure multiple regex pattern for different pods under one namespace
   - argo:
      from:
        pod: "log-*"
      multiline:
        firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:kube_test
   - spot:
      from:
        pod: "spot-*"
      multiline:
        firstline: /\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:applogs2

If argo and spot are arbitrary names, make sure you provide the container along with the pod. Internally, this config is applied to the file var.log.containers.<pod-name>*<container-name or config name, i.e. argo>*.log. If the file doesn't exist, the config is not applied, and no error is logged.

hvaghani221, Jul 04 '22 06:07

@harshit-splunk: Thanks for your response. Actually, argo is the namespace here, not an arbitrary name. I would like to match multiple pods (var.log.containers.<pod-name>*<container-name>) within the same namespace, and I'm not sure how to configure a separate firstline regex pattern for each of them. Something like the following: I know this config has issues, but it shows that in the argo namespace I want multiple regex patterns based on pod name.

logs:
# Not working, not sure how to configure multiple regex pattern for different pods under one namespace
   - argo:
      from:
        pod: "log-*"
      multiline:
        firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:kube_test
   - argo:
        from:
          pod: "spot-*"
        multiline:
          firstline: /\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
        sourcetype: kube:applogs2

naveenb30, Jul 05 '22 03:07

You do not have to use a namespace name as <name>. The actual log file name is <pod_name>_<namespace>_<container_name>-<container_id>.log, and the config looks for <pod-name>*<container-name or config name>*.log. You can combine the pod name and container name to target each pod, giving each config a different name.
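
The matching rule described here can be sketched in Python. This is an approximation under stated assumptions: the file names are hypothetical, they follow the usual kubelet layout <pod_name>_<namespace>_<container_name>-<container_id>.log, and `fnmatchcase` stands in for Fluentd's own tag matching, which works on dot-separated tags rather than plain globs. The helper names `file_glob` and `matched` are made up for illustration.

```python
from fnmatch import fnmatchcase

def file_glob(name, pod, container=None):
    """Glob described in the thread: <pod>*<container or config name>*.log."""
    return f"{pod}*{container or name}*.log"

# Hypothetical files in /var/log/containers for two pods in namespace "argo".
files = [
    "log-5c6b_argo_app-4567cdef.log",
    "spot-7d9f_argo_worker-0123abcd.log",
]

def matched(glob):
    """Return the files that the glob would select."""
    return [f for f in files if fnmatchcase(f, glob)]

# Config named "argo": the name happens to match the namespace part of the file name.
by_namespace = matched(file_glob("argo", pod="log-*"))
# Config named "spot" with no container: "spot" never appears after the pod prefix.
by_missing_name = matched(file_glob("spot", pod="spot-*"))
# Arbitrary config name plus an explicit container (hypothetical name "worker").
by_container = matched(file_glob("any-name", pod="spot-*", container="worker"))

print(by_namespace, by_missing_name, by_container)
```

Under these assumptions, a config named after the namespace matches only because the namespace is part of the file name, while an arbitrary name such as spot matches nothing unless a container is given.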

hvaghani221, Jul 05 '22 04:07

@harshit-splunk: I'm not sure how we can skip <name> here, but thanks for the hint: I found another solution that works. One config applies to all pods in the namespace (generic), and the other applies only when the pod name matches spot-*. The only catch is that I have to name the configs argo, argo*, argo**, and so on (I'm still not sure how to skip the name and still specify multiple configs).

logs:
    argo*:
      from:
        pod: "spot-*"
      multiline:
        firstline: /\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:applogs2
    argo:
      from:
        pod: "*"
      multiline:
        firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
      sourcetype: kube:kube_test

<filter tail.containers.var.log.containers.**argo*.log>
  @type concat
  key log
  timeout_label @PARSE
  stream_identity_key stream
  multiline_start_regexp /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
  flush_interval 5
  separator ""
  use_first_timestamp true
</filter>
<filter tail.containers.var.log.containers.spot-**argo**.log>
  @type concat
  key log
  timeout_label @PARSE
  stream_identity_key stream
  multiline_start_regexp /\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
  flush_interval 5
  separator ""
  use_first_timestamp true
</filter>
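
The two filter tags follow from the rule quoted earlier in the thread (<pod-name>*<container-name or config name>*.log). A small Python sketch reproduces them from the two config entries; the helper `filter_tag` is hypothetical and only mimics the string concatenation, it is not the chart's actual template.

```python
# Tag prefix as it appears in the generated filters above.
TAG_PREFIX = "tail.containers.var.log.containers."

def filter_tag(name, entry):
    """Build the filter tag: <pod>*<container or config name>*.log."""
    src = entry["from"]
    container = src.get("container", name)  # no container: fall back to the config name
    return f"{TAG_PREFIX}{src['pod']}*{container}*.log"

# The two entries from the working config, reduced to the fields that affect the tag.
logs = {
    "argo*": {"from": {"pod": "spot-*"}},
    "argo": {"from": {"pod": "*"}},
}
tags = {name: filter_tag(name, entry) for name, entry in logs.items()}

for name, tag in tags.items():
    print(f"{name!r:8} -> {tag}")
```

Because the keys under logs must be unique, the extra asterisk in argo* serves only to make the second name distinct while still matching the namespace part of the file name.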

naveenb30, Jul 18 '22 19:07

The following two configs produce the same result:

# Config 1: no container given, so the config name (argo) is used in the match
logs:
  argo:
    from:
      pod: "log-*"
    multiline:
      firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
    sourcetype: kube:kube_test

# Config 2: arbitrary config name with an explicit container
logs:
  random-name:
    from:
      pod: "log-*"
      container: "argo"
    multiline:
      firstline: /timestamp=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}/
    sourcetype: kube:kube_test
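
The equivalence can be checked with a few lines of Python, assuming the match pattern is built as <pod>*<container or config name>*.log as stated earlier in the thread. The function `match_pattern` is a hypothetical stand-in for that rule, not code from the chart.

```python
def match_pattern(name, entry):
    """Pattern the config is applied to: <pod>*<container or config name>*.log."""
    src = entry["from"]
    return f"{src['pod']}*{src.get('container', name)}*.log"

# First config: the name "argo" doubles as the container part of the pattern.
first = match_pattern("argo", {"from": {"pod": "log-*"}})
# Second config: arbitrary name, container set explicitly.
second = match_pattern("random-name", {"from": {"pod": "log-*", "container": "argo"}})

print(first)
print(first == second)
```

Under that assumption the config name matters only when container is omitted, which is why any unique name works once the container is spelled out.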

hvaghani221, Aug 17 '22 12:08