
Cannot open chunk. No such file or directory when filesystem is enabled (fluent-bit:v2.1.10)

yannick-vandewalle opened this issue 1 year ago • 5 comments

Problem definition

Hello all,

When I redeploy Fluentbit with filesystem storage enabled, I get the error noted below. The error does not occur when running with in-memory storage only; it only appears when filesystem storage is used.

[screenshot: storage-error]

The error seems to suggest that chunks for the emitters associated with my rewrite tag steps are not available. Looking at the emitted record counts I notice a steep decline, which partially recovers after a while but never quite reaches the original level of traffic. In the graph below the redeployment commenced at 9h30 (it takes some time for the Fluent operator to notice resource changes):

[screenshot: emitter-drop]

Chunks that are down in the filesystem aren't cleared as expected, and chunks are taken up into memory at a slightly slower rate.

  • purple: storage backlogs
  • orange: rewrite tag chunks

[screenshot: filesystem-overview]

During the redeployment a large number of logs are processed, indicating that the system is catching up, but we do notice a consistent drop in the average number of log records processed after the error messages occur. Note that around 10h15 the number of processed messages is consistently lower than around 9h30. It is somewhat hard to make out in this graph, but on average we went from processing 68 records to 35 records.

[screenshot: processed-records]

Because we enable quite a high retry limit of 15, no drops seem to occur in the short run, but we do see consistent retries.

[screenshot: retries]

When we run out of retries, records are dropped and the chunks down in the filesystem start to decrease:

[screenshot: records-dropped]

[screenshot: filesystem-reduction]

Lastly, I would like to note that we observed the bandwidth limit of the attached volume being (almost) reached. The 3GB filesystem supports a throughput of 110 MiB/s, and monitoring shows a value of 100 MiB/s was reached during redeployment. Since metric resolution happens on a per-minute basis, it is possible that we did exceed the throughput limit, if only for a fraction of a minute (for example, 30 seconds at 120 MiB/s followed by 30 seconds at 80 MiB/s would still average out to 100 MiB/s over the minute).

[screenshot: bandwidth-limit]
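As an aside, the chunk counts in the graphs above come from the storage metrics enabled in the configuration below (storage.metrics on, HTTP server on port 2020). They can also be inspected directly on a running pod, roughly as follows (a sketch; the pod name is just the example taken from the pod specs further down):

# forward the monitoring port of one fluent-bit pod
kubectl -n fluent port-forward pod/fluentbit-r45hs 2020:2020 &
# storage layer status: chunks up/down overall and per input, including the emitters
curl -s http://127.0.0.1:2020/api/v1/storage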

Current setup

We are using the Fluent operator to deploy Fluentbit inside our Kubernetes cluster running on Azure. Storage is backed by the azurefile-csi-premium storage class (3GB), which is available by default in Azure's Kubernetes offering. This filesystem is mounted onto 7 running instances of Fluentbit, one instance per node in the cluster. Below is an overview of the flow used during testing. Note that at the end of some steps we rewrite the tag names; I believe the error messages noted at the beginning pertain to failed emits for these rewrites.

Step | Description | Tag rewritten to
--- | --- | ---
Preprocessing | Add timestamps, kubernetes metadata, ... | output.app
App processing | Parsing all application logs, filtering rubbish logs | output.loki, output.elasticsearch
Loki processing | Filter on logs intended for Loki |
Elasticsearch processing | Filter on logs intended for Elasticsearch |
Post processing | Add completion timestamp |
flowchart LR
    app_1["App"]
    app_2["App"]
    app_3["App"]
    preprocessing["Preprocessing"]
    app_processing["App processing"]
    loki_processing["Loki processing"]
    elasticsearch_processing["Elasticsearch processing"]
    loki_output["Loki output"]
    elasticsearch_output["Elasticsearch output"]
    post_processing["Post processing"]
    app_1 -->|kube.*| preprocessing
    app_2 -->|kube.*| preprocessing
    app_3 -->|kube.*| preprocessing
    preprocessing -->|output.app| app_processing
    app_processing -->|output.loki| loki_processing
    app_processing -->|output.elasticsearch| elasticsearch_processing
    loki_processing -->|output.*| post_processing
    elasticsearch_processing -->|output.*| post_processing
    post_processing -->|output.elasticsearch| elasticsearch_output
    post_processing -->|output.loki| loki_output

ClusterFluentBitConfig resource

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: fluentbit-config
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluentbit
spec:
  service:
    httpListen: 0.0.0.0
    httpPort: 2020
    httpServer: true
    parsersFile: parsers.conf
    flushSeconds: 1
    storage:
      path: /tmp/fluent-bit-buffer
      sync: normal
      metrics: "on"
      backlogMemLimit: 900MB
      maxChunksUp: 300
      deleteIrrecoverableChunks: "on"
      checksum: "off"
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"

FluentBit resource

apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: fluentbit
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluentbit
spec:
  image: kubesphere/fluent-bit:v2.1.10
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    requests:
      cpu: 250m
      memory: 1000Mi
    limits:
      cpu: 1000m
      memory: 1400Mi
  fluentBitConfigName: fluentbit-config
  tolerations:
    - operator: Exists
  volumes:
    - name: lua-lib
      configMap:
        name: fluentbit-lua-lib
    - name: flb-storage
      persistentVolumeClaim:
        claimName: fluentbit
  volumesMounts:
    - name: lua-lib
      mountPath: /config/lua
    - name: flb-storage
      mountPath: /tmp/fluent-bit-buffer

Rewrite tag filter steps

- customPlugin:
    config: |
      Alias         relabel_app
      Name          rewrite_tag
      Match         kube.*
      Rule          $container_name ^(?!fluent-bit).*$ output.app false
      Emitter_Name  relabel_app
      Emitter_Mem_Buf_Limit 1000M
      Emitter_Storage.type filesystem
          
- customPlugin:
    config: |
      Alias         relabel_to_es_for_app
      Name          rewrite_tag
      Match         input.app
      Rule          $type ^messaging-content$ output.elasticsearch true
      Emitter_Name  relabel_to_es_for_app
      Emitter_Mem_Buf_Limit 1000M
      Emitter_Storage.type filesystem

- customPlugin:
    config: |
      Alias         relabel_to_loki_for_app
      Name          rewrite_tag
      Match         input.app
      Rule          $type ^.*$ output.loki false
      Emitter_Name  relabel_to_loki_for_app
      Emitter_Mem_Buf_Limit 1000M
      Emitter_Storage.type filesystem
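
For context, the rewrite_tag rule format is Rule $KEY REGEX NEW_TAG KEEP. Below is a commented version of the first rule above, using the same values (comments added for illustration only):

[FILTER]
    Name                  rewrite_tag
    Match                 kube.*
    # Rule format:        $KEY             REGEX                NEW_TAG      KEEP
    # Re-tag every record whose container_name does not start with "fluent-bit"
    # to output.app; KEEP=false drops the record under its original kube.* tag.
    Rule                  $container_name  ^(?!fluent-bit).*$   output.app   false
    # Each rewrite_tag filter spawns an internal emitter input; the Emitter_*
    # options configure that emitter's buffer limit and storage type.
    Emitter_Name          relabel_app
    Emitter_Mem_Buf_Limit 1000M
    Emitter_Storage.type  filesystem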

Persistent volume claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/name: fluentbit
  name: fluentbit
  namespace: fluent
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile-csi-premium
  resources:
    requests:
      storage: 3Gi

Resulting config file

[Service]
    Flush                                  1
    Http_Listen                            0.0.0.0
    Http_Port                              2020
    Http_Server                            true
    Parsers_File                           parsers.conf
    storage.path                           /tmp/fluent-bit-buffer
    storage.sync                           normal
    storage.checksum                       off
    storage.backlog.mem_limit              900MB
    storage.metrics                        on
    storage.max_chunks_up                  300
    storage.delete_irrecoverable_chunks    on

...

[Filter]
    Match                    kube.*
    Alias                    relabel_app
    Name                     rewrite_tag
    Match                    kube.*
    Rule                     $container_name ^(?!fluent-bit).*$ output.app false
    Emitter_Name             relabel_app
    Emitter_Mem_Buf_Limit    1000M
    Emitter_Storage.type     filesystem

[Filter]
    Match                    output.app
    Alias                    relabel_to_es_for_app
    Name                     rewrite_tag
    Match                    output.app
    Rule                     $type ^messaging-content$ output.elasticsearch true
    Emitter_Name             relabel_to_es_for_app
    Emitter_Mem_Buf_Limit    1000M
    Emitter_Storage.type     filesystem

[Filter]
    Match                    output.app
    Alias                    relabel_to_loki_for_app
    Name                     rewrite_tag
    Match                    output.app
    Rule                     $type ^.*$ output.loki false
    Emitter_Name             relabel_to_loki_for_app
    Emitter_Mem_Buf_Limit    1000M
    Emitter_Storage.type     filesystem

...

[Output]
    name                     es
    match                    output.elasticsearch
    host                     *********** 
    port                     ***********
    tls                      on
    tls.verify               on
    tls.debug                4
    net.dns.mode             TCP
    index                    messages
    http_user                ***********  
    http_passwd              *********** 
    suppress_type_name       On
    trace_output             Off
    trace_error              Off
    retry_limit              15


[Output]
    name                     loki
    match                    output.loki
    host                     ***********  
    labels                   $log_level, $service_name, $service_environment, $type, $log_file_message_entitypath
    port                     *********** 
    tls                      on
    tls.verify               on
    tls.debug                4
    line_format              json
    net.dns.mode             TCP
    retry_limit              15
    net.dns.resolver         LEGACY
    net.keepalive            on

Kubernetes infrastructure

Diagram

[diagram: kubernetes-infra]

Kubectl commands

Retrieving the fluentbit daemonset, which maintains one pod per node in the cluster.

[screenshot: daemonset]

All pods in the cluster.

[screenshot: pods]

The operator managing the daemonset redeployment.

[screenshot: operator]

Mount point

[screenshot: mount-point]

Resources for a pod

[screenshot: resources]
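
For reference, the screenshots above were gathered with commands along the following lines (a sketch only; resource names are taken from the manifests in this issue and exact flags may differ):

kubectl get daemonset fluentbit -n fluent        # the fluentbit daemonset
kubectl get pods -A -o wide                      # all pods in the cluster
kubectl get deployments -n fluent                # the operator managing the daemonset
kubectl exec -n fluent fluentbit-r45hs -- df -h /tmp/fluent-bit-buffer   # mount point of the buffer volume
kubectl top pod -n fluent                        # resources for the pods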

Pod specs

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: '2023-12-12T15:33:30Z'
  generateName: fluentbit-
  labels:
    app.kubernetes.io/name: fluentbit
    controller-revision-hash: 7f845b864b
    pod-template-generation: '2'
  name: fluentbit-r45hs
  namespace: fluent
  ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: DaemonSet
      name: fluentbit
      uid: 3247a971-951c-436a-b13c-415ebd8fa3fe
  resourceVersion: '388287869'
  uid: 3b38b402-5b5a-4044-b16a-b26c7955187f
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchFields:
              - key: metadata.name
                operator: In
                values:
                  - ********************
  containers:
    - env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
      image: kubesphere/fluent-bit:v2.1.10
      imagePullPolicy: IfNotPresent
      name: fluent-bit
      ports:
        - containerPort: 2020
          name: metrics
          protocol: TCP
      resources:
        limits:
          cpu: '1'
          memory: 1400Mi
        requests:
          cpu: 250m
          memory: 1000Mi
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /containers
          mountPropagation: None
          name: varlibcontainers
          readOnly: true
        - mountPath: /fluent-bit/config
          name: config
          readOnly: true
        - mountPath: /var/log/
          mountPropagation: None
          name: varlogs
          readOnly: true
        - mountPath: /var/log/journal
          mountPropagation: None
          name: systemd
          readOnly: true
        - mountPath: /config/lua
          name: lua-lib
        - mountPath: /tmp/fluent-bit-buffer
          name: flb-storage
        - mountPath: /fluent-bit/tail
          name: positions
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-n92v6
          readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ********************
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: fluentbit
  serviceAccountName: fluentbit
  terminationGracePeriodSeconds: 30
  tolerations:
    - operator: Exists
  volumes:
    - hostPath:
        path: /containers
        type: ''
      name: varlibcontainers
    - name: config
      secret:
        defaultMode: 420
        secretName: fluentbit-config
    - hostPath:
        path: /var/log
        type: ''
      name: varlogs
    - hostPath:
        path: /var/log/journal
        type: ''
      name: systemd
    - configMap:
        defaultMode: 420
        name: fluentbit-lua-lib
      name: lua-lib
    - name: flb-storage
      persistentVolumeClaim:
        claimName: fluentbit
    - hostPath:
        path: /var/lib/fluent-bit/
        type: ''
      name: positions
    - name: kube-api-access-n92v6
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
                - key: ca.crt
                  path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
status:
  conditions:
    - lastProbeTime: null
      lastTransitionTime: '2023-12-12T15:33:30Z'
      status: 'True'
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: '2023-12-12T15:33:31Z'
      status: 'True'
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: '2023-12-12T15:33:31Z'
      status: 'True'
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: '2023-12-12T15:33:30Z'
      status: 'True'
      type: PodScheduled
  containerStatuses:
    - containerID: >-
        containerd://b317d95f2cc0ce1e54ab94c6866ce694d67e9a8858a7e82a08f371a0790bb48c
      image: docker.io/kubesphere/fluent-bit:v2.1.10
      imageID: >-
        docker.io/kubesphere/fluent-bit@sha256:62e4e5a4677a8424cf077dc7ddeb38632bef150086a2a4d0e3f5227de0e79682
      lastState: {}
      name: fluent-bit
      ready: true
      restartCount: 0
      started: true
      state:
        running:
          startedAt: '2023-12-12T15:33:31Z'
  hostIP: ********************
  phase: Running
  podIP:  ********************
  podIPs:
    - ip: ********************
  qosClass: Burstable
  startTime: '2023-12-12T15:33:30Z'

Daemonset specs

apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: '2023-12-12T12:44:21Z'
  generation: 2
  labels:
    app.kubernetes.io/name: fluentbit
  name: fluentbit
  namespace: fluent
  ownerReferences:
    - apiVersion: fluentbit.fluent.io/v1alpha2
      blockOwnerDeletion: true
      controller: true
      kind: FluentBit
      name: fluentbit
      uid: 78b16517-b196-404f-bd4a-bf952481f75a
  resourceVersion: '390360959'
  uid: 3247a971-951c-436a-b13c-415ebd8fa3fe
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/name: fluentbit
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: fluentbit
      name: fluentbit
      namespace: fluent
    spec:
      containers:
        - env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
          image: kubesphere/fluent-bit:v2.1.10
          imagePullPolicy: IfNotPresent
          name: fluent-bit
          ports:
            - containerPort: 2020
              name: metrics
              protocol: TCP
          resources:
            limits:
              cpu: '1'
              memory: 1400Mi
            requests:
              cpu: 250m
              memory: 1000Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /containers
              mountPropagation: None
              name: varlibcontainers
              readOnly: true
            - mountPath: /fluent-bit/config
              name: config
              readOnly: true
            - mountPath: /var/log/
              mountPropagation: None
              name: varlogs
              readOnly: true
            - mountPath: /var/log/journal
              mountPropagation: None
              name: systemd
              readOnly: true
            - mountPath: /config/lua
              name: lua-lib
            - mountPath: /tmp/fluent-bit-buffer
              name: flb-storage
            - mountPath: /fluent-bit/tail
              name: positions
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: fluentbit
      serviceAccountName: fluentbit
      terminationGracePeriodSeconds: 30
      tolerations:
        - operator: Exists
      volumes:
        - hostPath:
            path: /containers
            type: ''
          name: varlibcontainers
        - name: config
          secret:
            defaultMode: 420
            secretName: fluentbit-config
        - hostPath:
            path: /var/log
            type: ''
          name: varlogs
        - hostPath:
            path: /var/log/journal
            type: ''
          name: systemd
        - configMap:
            defaultMode: 420
            name: fluentbit-lua-lib
          name: lua-lib
        - name: flb-storage
          persistentVolumeClaim:
            claimName: fluentbit
        - hostPath:
            path: /var/lib/fluent-bit/
            type: ''
          name: positions
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 7
  desiredNumberScheduled: 7
  numberAvailable: 7
  numberMisscheduled: 0
  numberReady: 7
  observedGeneration: 2
  updatedNumberScheduled: 7

Redeployment procedure

The Fluent operator is responsible for automatically redeploying the Fluentbit daemonset when any of its custom resources is altered. For testing, I usually alter spec.resources.requests.cpu by a value of 1m. This doesn't change the overall CPU usage by much, but it does trigger a redeployment. For the resource below I change the CPU and run kubectl apply -f fluentbit.yaml:

apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  name: fluentbit
  namespace: fluent
  labels:
    app.kubernetes.io/name: fluentbit
spec:
  image: kubesphere/fluent-bit:v2.1.10
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    requests:
      cpu: 250m
      memory: 1000Mi
    limits:
      cpu: 1000m
      memory: 1400Mi
  fluentBitConfigName: fluentbit-config
  tolerations:
    - operator: Exists
  volumes:
    - name: lua-lib
      configMap:
        name: fluentbit-lua-lib
    - name: flb-storage
      persistentVolumeClaim:
        claimName: fluentbit
  volumesMounts:
    - name: lua-lib
      mountPath: /config/lua
    - name: flb-storage
      mountPath: /tmp/fluent-bit-buffer
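
In practice, triggering and following a redeployment looks roughly like this (a sketch; the daemonset name comes from the specs above):

# after bumping spec.resources.requests.cpu by 1m in fluentbit.yaml:
kubectl apply -f fluentbit.yaml
# the Fluent operator picks up the change and rolls the daemonset one pod at a
# time (maxUnavailable: 1 in the daemonset spec above)
kubectl rollout status daemonset/fluentbit -n fluent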

Request

Why am I receiving the error messages stated at the beginning? Have I set unreasonable configuration values, or are we dealing with an actual bug in Fluentbit 2.1.10?

yannick-vandewalle avatar Dec 13 '23 15:12 yannick-vandewalle

I think it will be helpful to include the actual pod specs as well just to confirm what the K8S translation by the operator actually looks like.

Can you clarify what you mean by redeploy? How and what are you doing to roll out the updates? Do the chunks exist on the filesystem of the original pod but not the new pod? Mostly trying to narrow down the failure as to where it is - is the file missing, corrupt, etc.? Is the error reproducible with a single pod, i.e. no shared PV or filesystem?

patrick-stephens avatar Dec 13 '23 19:12 patrick-stephens

> I think it will be helpful to include the actual pod specs as well just to confirm what the K8S translation by the operator actually looks like.
>
> Can you clarify what you mean by redeploy? How and what are you doing to roll out the updates? Do the chunks exist on the filesystem of the original pod but not the new pod? Mostly trying to narrow down the failure as to where it is - is the file missing, corrupt, etc.? Is the error reproducible with a single pod, i.e. no shared PV or filesystem?

Hey Patrick,

Thanks for replying!

I extended the initial documentation with the requested information where possible.

  1. Added a section describing the redeploy procedure
  2. Chunks should exist on the redeployed pods as well, as storage is persistent and gets attached to any new pods.
  3. I'm guessing the file is missing, since the error states no such file or directory. While C isn't my area of expertise, I can see the error traces back to the following code. An open() of the chunk file is attempted, but it fails and we end up in the last if statement with a file descriptor of -1, i.e. the file could not be opened (see also the small illustration after this list).
int cio_file_native_open(struct cio_file *cf)
{
    if (cio_file_native_is_open(cf)) {
        return CIO_OK;
    }

    /* Open file descriptor */
    if (cf->flags & CIO_OPEN_RW) {
        cf->fd = open(cf->path, O_RDWR | O_CREAT, (mode_t) 0600);
    }
    else if (cf->flags & CIO_OPEN_RD) {
        cf->fd = open(cf->path, O_RDONLY);
    }

    if (cf->fd == -1) {
        cio_file_native_report_os_error();

        return CIO_ERROR;
    }

    return CIO_OK;
}
  4. Testing on a single pod is hard, since the deployment is a daemonset deployed on all nodes and managed by the operator. The operator also doesn't seem to provide a way of dynamically allocating a separate filesystem/disk for each individual pod.
  5. I added the daemonset and pod specs.
  6. I noticed that we are hitting the higher end of our bandwidth during the redeploy, which might be an issue (see the newly added graph in the documentation).
  7. I added a diagram with an overview of the general architecture.
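
For completeness, the "No such file or directory" part of the message is simply the operating system's error string for ENOENT, reported after the failed open() above. A minimal standalone illustration of that behaviour (not the actual chunkio reporting code; the path is made up):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Try to open a chunk file that no longer exists under the buffer path */
    int fd = open("/tmp/fluent-bit-buffer/emitter/missing.flb", O_RDWR);

    if (fd == -1) {
        /* errno is ENOENT, so strerror() yields "No such file or directory" */
        printf("cannot open chunk: %s\n", strerror(errno));
        return 1;
    }

    close(fd);
    return 0;
}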

yannick-vandewalle avatar Dec 15 '23 09:12 yannick-vandewalle

I'm encountering the same problem. @yannick-vandewalle, were you able to resolve this issue? If so, could you please share your solution?

mirko-lazarevic avatar Mar 27 '24 10:03 mirko-lazarevic

Are you seeing it in 3.0 or 2.2, @mirko-lazarevic? 2.1 is no longer supported.

patrick-stephens avatar Mar 27 '24 14:03 patrick-stephens

@patrick-stephens I was seeing the issue with version 2.2.0; however, I don't observe it any more with version 3.0.

mirko-lazarevic avatar Apr 19 '24 13:04 mirko-lazarevic

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Jul 20 '24 01:07 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Jul 26 '24 01:07 github-actions[bot]