bug: The FluentBit CRD lacks container-level security context settings.

Open benz9527 opened this issue 2 years ago • 3 comments

Describe the issue

My Kubernetes Environment

  • Distribution: RedHat OpenShift 4.8.22
  • Container Runtime: CRI-O
  • Worker Node OS: RedHat CoreOS 4.8

The configuration to deploy fluentbit by fluent-operator

ClusterInput

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  name: omc-fluentbit-tail
  labels:
    fluentbit.fluent.io/enabled: "true"
    omc.fluentbit.input/enabled: "true"
    fluentbit.fluent.io/component: logging
spec:
  tail:
    tag: kube.*
    path: /var/log/containers/*.log
    parser: cri
    refreshIntervalSeconds: 10
    memBufLimit: 5MB
    skipLongLines: true
    db: /var/lib/fluent-bit/pos.db
    dbSync: Normal

ClusterOutput

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: omc-fluentbit-es
  labels:
    fluentbit.fluent.io/enabled: "true"
    omc.fluentbit.output/enabled: "true"
    fluentbit.fluent.io/component: logging
spec:
  matchRegex: ".*"
  es:
    host: "my-es-url"
    port: 9200
    index: "omc-log"
    generateID: true
    logstashFormat: false
    timeKey: "@timestamp"
    httpUser:
      valueFrom:
        secretKeyRef:
          name: "my-es-user-secret"
          key: "user"
    httpPassword:
      valueFrom:
        secretKeyRef:
          name: "my-es-user-secret"
          key: "elastic"
    traceError: true
    traceOutput: true

ClusterFluentBitConfig

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: omc-fluent-bit-config
  labels:
    app.kubernetes.io/name: omc-fluent-bit
spec:
  service:
    parsersFile: parsers.conf
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      omc.fluentbit.input/enabled: "true"
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
      omc.fluentbit.output/enabled: "true"

FluentBit

apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: omc-fluent-bit
  namespace: ztw-test
  labels:
    app.kubernetes.io/name: omc-fluent-bit
spec:
  image: kubesphere/fluent-bit:v2.0.10
  positionDB:
    emptyDir: {}
  resources:
    requests:
      cpu: 10m
      memory: 25Mi
    limits:
      cpu: 500m
      memory: 200Mi
  fluentBitConfigName: omc-fluent-bit-config
  securityContext: # This securityContext is applied at the pod level of the DaemonSet, not to the fluentbit container or initContainer, so 'privileged: true' cannot be set here.
    runAsNonRoot: false
    runAsUser: 0
  tolerations:
    - operator: Exists
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "node-role.kubernetes.io/worker"
                operator: Exists

The question scenario

When I used the configuration above to deploy a set of fluentbit pods to collect logs from specific pods on OpenShift, I got an error like this:

image

Note: This error can be fixed by setting the fluentbit positionDB to emptyDir.

Another error is: image

Note: I have bound both the fluent-operator service account (named fluent-operator) and the fluentbit service account (named omc-fluent-bit) to the OpenShift privileged SCC, but the errors above still occur.

image
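
For readers unfamiliar with SCC bindings, the following is a minimal sketch of how such a binding might be expressed through RBAC. It assumes the cluster provides the ClusterRole `system:openshift:scc:privileged`; the binding name is hypothetical and this is not necessarily how the binding in this issue was created. Note that an SCC only permits a pod to request privileged mode. The container spec itself must still set `privileged: true`, which is why the binding alone does not change the result here.

```yaml
# Sketch only: allows the fluent-bit service account to use the "privileged" SCC.
# Assumption: the ClusterRole "system:openshift:scc:privileged" exists on the cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: omc-fluent-bit-privileged-scc   # hypothetical name
  namespace: ztw-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
  - kind: ServiceAccount
    name: omc-fluent-bit
    namespace: ztw-test
```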

Finally

I had no idea how to continue deploying fluent-bit with fluent-operator, so I tried another way and wrote the DaemonSet myself.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: omc-fluent-bit
  labels:
    app.kubernetes.io/name: omc-fluent-bit
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: omc-fluent-bit
  template:
    metadata:
      labels:
        app.kubernetes.io/name: omc-fluent-bit
    spec:
      volumes:
        - name: varlibcontainers
          hostPath:
            path: /var/log/containers
            type: ''
        - name: config
          secret:
            secretName: omc-fluent-bit-config
            defaultMode: 420
        - name: varlogs
          hostPath:
            path: /var/log
            type: ''
        - name: systemd
          hostPath:
            path: /var/log/journal
            type: ''
        - name: positions
          emptyDir: {}
      containers:
        - name: fluent-bit
          image: "kubesphere/fluent-bit:v2.0.10"
          ports:
            - name: metrics
              containerPort: 2020
              protocol: TCP
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: varlibcontainers
              readOnly: true
              mountPath: /var/log/containers
            - name: config
              readOnly: true
              mountPath: /fluent-bit/config
            - name: varlogs
              readOnly: true
              mountPath: /var/log/
            - name: systemd
              readOnly: true
              mountPath: /var/log/journal
            - name: positions
              mountPath: /fluent-bit/tail
          securityContext:
            privileged: true # <---- This is the container-level setting that is needed.
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: omc-fluent-bit
      serviceAccount: omc-fluent-bit
      schedulerName: default-scheduler
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/worker
                    operator: Exists
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  revisionHistoryLimit: 10

It works.

To Reproduce

Please use the configurations posted under 'Describe the issue'.

Expected behavior

My Expectation

The FluentBit CRD should provide a setting for the container security context, not only the pod-level one for the DaemonSet.
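
A rough sketch of what this request could look like in the FluentBit resource follows. The field name `containerSecurityContext` is an assumption made for illustration only; it is not confirmed to exist in fluent-operator v2.1.0, which is the version this issue was filed against.

```yaml
# Hypothetical sketch: "containerSecurityContext" is an assumed field name,
# shown only to illustrate the requested behavior.
apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: omc-fluent-bit
  namespace: ztw-test
spec:
  image: kubesphere/fluent-bit:v2.0.10
  fluentBitConfigName: omc-fluent-bit-config
  securityContext:            # existing pod-level setting
    runAsNonRoot: false
    runAsUser: 0
  containerSecurityContext:   # requested container-level setting (assumed name)
    privileged: true
```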

Your Environment

  • Fluent Operator version: "kubesphere/fluent-operator:v2.1.0"
  • Container Runtime: CRI-O
  • Operating system:

NAME="Red Hat Enterprise Linux CoreOS"
VERSION="48.84.202111222303-0"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION_ID="4.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 48.84.202111222303-0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.8"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.8"
OPENSHIFT_VERSION="4.8"
RHEL_VERSION="8.4"
OSTREE_VERSION='48.84.202111222303-0'
  • Kernel version: 4.18.0-305.28.1.el8_4.x86_64


How did you install fluent operator?

Helm chart from this repo release.

Additional context

No response

benz9527, Mar 31 '23 01:03

Below is the pod-level security context: image

But what OpenShift needs is the container-level security context: image

image
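
To make the distinction concrete, here is a minimal sketch using a plain Kubernetes Pod (the Pod name is hypothetical). The pod-level `securityContext` has no `privileged` field, while the container-level `securityContext` does, so privileged mode can only be requested per container.

```yaml
# Sketch only: contrasts the two securityContext levels in a Pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo   # hypothetical name
spec:
  securityContext:        # pod level (PodSecurityContext): no "privileged" field
    runAsUser: 0
  containers:
    - name: fluent-bit
      image: kubesphere/fluent-bit:v2.0.10
      securityContext:    # container level (SecurityContext)
        privileged: true
```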

benz9527, Mar 31 '23 01:03

Why does the 'Expected behavior' section display a duplicate of the issue description?

benz9527, Mar 31 '23 01:03

The /var/log/containers error is probably a problem with the mount path: the directory is not mounted in fluentbit. You can set this: https://github.com/fluent/fluent-operator/blob/c56cb02ef27231ebea37b62b4f40e794e7d9aa41/apis/fluentbit/v1alpha2/fluentbit_types.go#L77

wenchajun, Apr 03 '23 10:04