fluent-operator FluentBit Sidecar Injection

Is your feature request related to a problem? Please describe.

Hello! I work for a large Italian company in the energy sector, and we are building a log pipeline that uses FluentBit as a sidecar container sending records to FluentD deployment pods. We are onboarding hundreds of microservices, so there is a lot of repetitive work to do right now.

Describe the solution you'd like

So, I'm wondering whether a sidecar injector pattern (the same one you find in OpenTelemetry, Envoy, etc.), combining pod metadata annotations with a mutating admission webhook, is something you are considering implementing and, if not, why?

Additional context

I tried to find related issues but didn't find any. I don't know the Fluent Operator codebase yet, so what I am asking may not be feasible in the first place, but since this is a real-world scenario we are facing, it could be of interest to the community.

To conclude, I would be willing to contribute to implementing this feature if you express interest in it.

Jan 29 '24 18:01 AlessandroFazio

It's not on the roadmap yet. But Fluent operator maintainers are open to seeing proposals and contributions for this feature. @AlessandroFazio

Jan 30 '24 01:01 benjaminhuo

Hello @benjaminhuo, I'm glad to see you are open to a proposal. I have taken some time to come up with a more formal feature request, which follows.

FluentBit Sidecar Feature Request

Disclosure

Before proceeding with the technical content, I would like to stress 2 points:

I haven’t had the time to look closely at the Fluent Operator codebase, so I could have missed some points in this initial draft of the solution.
This is not meant to be an exhaustive description of the feature; it is written purely so that you (and likely I too) can see more concretely what I have in mind and how it can be achieved.

Introduction

This feature request aims at adding the ability for Fluent Operator users to inject FluentBit as a sidecar container in pods.

This container should in some way fetch the log events produced by the main application container and forward them to some destination. The destination could be FluentD, leveraging the FluentD CR provided by the operator, or some other destination.

For fetching log events I propose using an emptyDir volume, which the app writes to and FluentBit reads from. The General Considerations section below explains the reason behind this choice.

This solution requires the manager to bootstrap a new webhook server (which, from what I have seen, is not already bootstrapped in the operator codebase) serving at least a mutating admission webhook under some path, for example /mutate-core-v1-pods.
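To make the webhook contract concrete, here is a minimal sketch of the response a mutating admission webhook sends back to the API server: an AdmissionReview whose patch is a base64-encoded JSON Patch. It is written in Python purely for illustration (the operator itself is written in Go), and the function name build_admission_response is hypothetical.

```python
import base64
import json


def build_admission_response(uid, patch_ops):
    """Wrap a list of JSON Patch operations in an AdmissionReview response.

    `uid` must echo the uid of the incoming AdmissionReview request.
    An empty `patch_ops` list means "admit the pod unchanged".
    """
    response = {"uid": uid, "allowed": True}
    if patch_ops:
        # The API server expects the JSON Patch document base64-encoded.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(
            json.dumps(patch_ops).encode("utf-8")
        ).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Returning the response without a patch is how the webhook would behave while the injector is disabled.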

Configuration

The mutating webhook could be configured using a ConfigMap added as part of the manifests and mounted as a volume in the controller-manager container. The ConfigMap can be read from its mountPath by adding a flag in the main.go file, plus some common logic for loading its content into memory.

The user should opt in to the sidecar injector, i.e. the feature would be disabled by default. To enable the injection logic, set the first property, enabled (a bool), to true. So by default the webhook server will be part of the controller-manager process but will do nothing, returning the pod object as is.

The ConfigMap could look like the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-sidecar-injector
  ...
data:
  # the key name below is illustrative
  config.yaml: |
    enabled: true
    fluentBitImage: kubesphere/fluent-bit:tag
    sidecarRequestsCPU: 100m
    ...
    sidecarLimitsCPU: 300m
    ...

Other webhook manifests like Deployment, Service, CertManager related manifests, etc… are skipped for the sake of brevity, but naturally required.
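The opt-in behaviour of the injector configuration could be sketched as follows. This assumes the ConfigMap content has already been parsed into a dictionary; the property names are the ones shown in the ConfigMap above, while InjectorConfig and load_injector_config are hypothetical names. Python is used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InjectorConfig:
    enabled: bool = False  # opt-in: injection is off unless explicitly enabled
    fluent_bit_image: str = ""
    sidecar_requests_cpu: str = ""
    sidecar_limits_cpu: str = ""


def load_injector_config(raw):
    """Build the config from an already-parsed mapping, applying safe defaults."""
    return InjectorConfig(
        # Only a real boolean true enables the feature; anything else stays off.
        enabled=raw.get("enabled", False) is True,
        fluent_bit_image=str(raw.get("fluentBitImage", "")),
        sidecar_requests_cpu=str(raw.get("sidecarRequestsCPU", "")),
        sidecar_limits_cpu=str(raw.get("sidecarLimitsCPU", "")),
    )
```

Treating a missing or malformed enabled value as false keeps the webhook a no-op on a fresh install.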

The webhook configuration manifest could look something like this:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
webhooks:
  - name: fluent-sidecar.example.com
    namespaceSelector:
      matchLabels:
        fluent.sidecar.io/enabled: "true"
    rules:
      - apiGroups:   [""]
        apiVersions: ["v1"]
        operations:  ["CREATE"]
        resources:   ["pods"]
        scope:       "Namespaced"
...

Sidecar Injection Logic

The sidecar injection implementation follows the ‘Istio way’. You label the namespace with fluent.sidecar.io/enabled: "true"; then, for each pod created in this namespace, the API server calls the webhook. This way the namespace selector configuration limits the number of pods the webhook processes. If the user specifies fluent.sidecar.io/inject: "false" in the pod metadata.annotations[], injection is skipped.
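The decision described above can be sketched as a small pure function. The label and annotation keys are the ones proposed in this section; the function name should_inject is hypothetical, and the namespace label check is only a defensive repeat of what the API server already does through the namespaceSelector. Python is used purely for illustration.

```python
NAMESPACE_LABEL = "fluent.sidecar.io/enabled"
INJECT_ANNOTATION = "fluent.sidecar.io/inject"


def should_inject(config_enabled, namespace_labels, pod_annotations):
    """Return True when the sidecar should be injected into the pod."""
    if not config_enabled:
        return False  # the injector is disabled globally
    if namespace_labels.get(NAMESPACE_LABEL) != "true":
        return False  # the namespace has not opted in
    # Pods are injected unless they explicitly opt out.
    value = pod_annotations.get(INJECT_ANNOTATION, "true")
    return value.strip().lower() != "false"
```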

Sidecar customization can be achieved either through:

Pod metadata annotations
Partial configuration of FluentBit sidecar in pod.spec.containers

Pod Metadata Annotations

The following pod metadata.annotations[] let the user customize the sidecar injection execution logic.

- Key: fluent.sidecar.io/inject 
  Value: bool
  Desc: specify whether the webhook should inject the sidecar
- Key: fluent.sidecar.io/application-logs/path 
  Value: string
  Desc: specify the path on app container directory where app logger appender will write log files
- Key: fluent.sidecar.io/position-db/volume 
  Value: bool
  Desc: specify whether the webhook should inject the position-db volume
- Key: fluent.sidecar.io/application-logs/volume-size-limit
  Value: quantity
  Desc: specify the application logs volume size limit
- Key: fluent.sidecar.io/sidecar/request-cpu
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.cpu
- Key: fluent.sidecar.io/sidecar/request-memory
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.memory
- Key: fluent.sidecar.io/sidecar/request-ephemeral-storage
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.ephemeral-storage
- Key: fluent.sidecar.io/sidecar/limit-cpu
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.cpu
- Key: fluent.sidecar.io/sidecar/limit-memory
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.memory
- Key: fluent.sidecar.io/sidecar/limit-ephemeral-storage
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.ephemeral-storage
- Key: fluent.sidecar.io/sidecar/image
  Value: string
  Desc: specify the FluentBit sidecar container image
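As an illustration of how the resource annotations above might be turned into the sidecar's resources block, here is a minimal Python sketch. The function name and the defaults argument are hypothetical. One caveat: Kubernetes annotation keys allow only a single `/`, between an optional prefix and the name, so keys of the form listed above would likely need to be flattened (for example fluent.sidecar.io/sidecar-request-cpu). The prefix is kept in one constant so that such a change stays local.

```python
# Prefix as proposed above; it may need flattening to satisfy key validation.
ANNOTATION_PREFIX = "fluent.sidecar.io/sidecar/"
RESOURCE_NAMES = ("cpu", "memory", "ephemeral-storage")


def resources_from_annotations(annotations, defaults):
    """Merge per-pod annotation overrides over the injector-wide defaults."""
    resources = {
        "requests": dict(defaults.get("requests", {})),
        "limits": dict(defaults.get("limits", {})),
    }
    for kind, section in (("request", "requests"), ("limit", "limits")):
        for name in RESOURCE_NAMES:
            value = annotations.get(f"{ANNOTATION_PREFIX}{kind}-{name}")
            if value:
                resources[section][name] = value  # the annotation wins
    return resources
```

The quantities are passed through as strings; validating them is left to the API server in this sketch.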

Partial Sidecar Configuration

The user could directly include the fluent-bit sidecar container in the pod .spec.containers[]. The partial configuration could look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    # other containers #
    - name: fluent-bit
      image: auto
...

If this is done, the following will happen:

Webhook will know the pod needs to be processed
Webhook will check if the image field is set to ‘auto’. If so, it will replace it with the KubeSphere FluentBit image
Webhook will add the usual components to the FluentBit container if not already specified by the user

This customization option is found in the Istio sidecar injection implementation and offers both flexibility and ease of implementation (both for users and for us as developers).
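The three steps above could be sketched like this, with the pod represented as a plain dictionary. The function name complete_partial_sidecar and the default_container argument (the fields the webhook would normally add) are hypothetical, and Python is used purely for illustration.

```python
import copy

SIDECAR_NAME = "fluent-bit"
AUTO_IMAGE = "auto"


def complete_partial_sidecar(pod, default_image, default_container):
    """Fill in a user-declared fluent-bit container without overriding user fields."""
    pod = copy.deepcopy(pod)  # never mutate the caller's object
    for container in pod.get("spec", {}).get("containers", []):
        if container.get("name") != SIDECAR_NAME:
            continue
        if container.get("image") == AUTO_IMAGE:
            container["image"] = default_image
        # Add only the fields the user did not already specify.
        for key, value in default_container.items():
            container.setdefault(key, copy.deepcopy(value))
    return pod
```

An image other than ‘auto’ is left untouched, so a user can pin a custom FluentBit build.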

Reconcile Logic

Here is a short description of the webhook business logic:

Check whether the pod should be skipped, which is the case if: a) the sidecar injector config sets enabled: false, or b) the pod metadata.annotations[] specify fluent.sidecar.io/inject: false
Add to pod spec.volumes[] the following: a) Application logs emptyDir volume b) FluentBit config emptyDir volume c) FluentBitDb emptyDir volume
Create the FluentBit sidecar container
Mount inside the FluentBit sidecar container: a) FluentBit config volume b) FluentBitDb volume, if specified in the pod annotations c) ApplicationLogs volume
Add the FluentBit sidecar container to pod spec.containers[]
Mount the ApplicationLogs volume into the application container at the specified mount point
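The mutation steps above (after the skip check) can be sketched on a pod represented as a plain dictionary. Everything named here is an assumption made for illustration: the function name, the volume names, the mount paths under /fluent-bit, and the choice to mount the logs volume into every existing container.

```python
import copy


def inject_sidecar(pod, image, logs_path, with_position_db=True):
    """Return a copy of the pod with the volumes and FluentBit sidecar added."""
    pod = copy.deepcopy(pod)
    spec = pod.setdefault("spec", {})
    volumes = spec.setdefault("volumes", [])
    containers = spec.setdefault("containers", [])

    # Step 2: emptyDir volumes shared between the app and the sidecar.
    volumes.append({"name": "app-logs", "emptyDir": {}})
    volumes.append({"name": "fluent-bit-config", "emptyDir": {}})
    # Steps 3-4: the sidecar container and its mounts.
    mounts = [
        {"name": "fluent-bit-config", "mountPath": "/fluent-bit/config"},
        {"name": "app-logs", "mountPath": logs_path, "readOnly": True},
    ]
    if with_position_db:
        volumes.append({"name": "fluent-bit-db", "emptyDir": {}})
        mounts.append({"name": "fluent-bit-db", "mountPath": "/fluent-bit/db"})
    # Step 6: application containers write their logs into the shared volume.
    for container in containers:
        container.setdefault("volumeMounts", []).append(
            {"name": "app-logs", "mountPath": logs_path}
        )
    # Step 5: append the sidecar last so it does not receive the app mount.
    containers.append({"name": "fluent-bit", "image": image, "volumeMounts": mounts})
    return pod
```

A real webhook would express the difference between the input and output pod as a JSON Patch rather than return the whole object.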

General Considerations

A natural decision is to leverage CRs already provided by the Fluent Operator, such as the configuration-related CRs. Both the namespaced and the cluster-level configuration CRs can be leveraged depending on the user needs.

In both cases the FluentBit configuration secret will be mounted as a volume inside the FluentBit sidecar container.

However, an issue can arise when the user deploys the operator but not the configuration CRs. The probability of this happening is low, but it can happen. In this situation the secret would not be created and, if some namespaces are matched, scheduled pods will be stuck in ContainerCreating indefinitely, since they are trying to mount a non-existent secret.

To overcome this issue I have come up with these 2 solutions:

Create a validation webhook that checks whether the secret exists and rejects the pod if it does not
Use an InitContainer, equipped with appropriate service account permissions, which queries the API server for the secret in its namespace. The InitContainer will fail on a timeout or a resource-not-found error, causing the pod to fail.

As you can imagine they are not mutually exclusive, in the sense that the validation webhook can be useful at least to send warnings to the user.
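The validation webhook option, including the warn-only variant, could be sketched as below. The lookup of the secret is injected as a callable so the sketch stays independent of any Kubernetes client; review_config_secret, secret_exists and warn_only are hypothetical names, and the 403 status code is just one reasonable choice.

```python
def review_config_secret(uid, namespace, secret_name, secret_exists, warn_only=False):
    """Build an AdmissionReview response based on whether the config secret exists.

    `secret_exists` is a callable (namespace, name) -> bool supplied by the caller.
    """
    response = {"uid": uid, "allowed": True}
    if not secret_exists(namespace, secret_name):
        message = f"secret {namespace}/{secret_name} not found; the sidecar cannot start"
        if warn_only:
            # Admit the pod but surface a warning to the client.
            response["warnings"] = [message]
        else:
            response["allowed"] = False
            response["status"] = {"code": 403, "message": message}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```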

You may notice that, given this issue, it becomes even more important to give the user the ability to skip the injection using pod metadata annotations. I guess there are other ways to deal with this, but I haven't had the time to think about them yet. I hope you can help me in this regard.

As for the emptyDir volume used to store the application log files written by the app and read by FluentBit, this is the solution we actually came up with at my company. Leaving aside tricks such as one I’ve seen based on Linux pipes, we found no easy way to make log events available to a FluentBit process in a separate container other than mounting that volume and reading the files in it.

To conclude, I haven't talked about kustomize or helm configuration. I know this is a crucial topic: if this software is not easy to install, it loses a lot of its value. However, I preferred to discuss the business logic and related issues for now. If the discussion about this feature progresses, we will have time to think about the deployment side of things.

Well, this is the end for now. I know that this description is far from complete, but we have to start from something. Let me know what you think about it with some feedback.

Feb 01 '24 22:02 AlessandroFazio

@AlessandroFazio I think the feature is overall useful, but from what I've seen, in k8s sidecar is only useful for logging when the application isn't writing logs to stdout and instead writes it to some file in the container.

Questions or comments on your proposal:

You mentioned using existing CRs to configure fluent-bit sidecar. Will this result in creation of a secret per pod that'd be mounted?
I wouldn't recommend using the same set of CRs for configuring sidecars, since logging sidecar configs are specific to pods being injected into (or the application running in the pod, you'd use a different filter/parser for different applications), so the sidecar configs don't really have a cluster or namespace scope.

May 13 '24 09:05 adiforluls

Hello @adiforluls, I'm glad to see some feedback on this feature.

To address the points you raised:

Based on my experience, choosing between writing logs to stdout or the filesystem isn't necessarily an either-or decision. Most mature logging libraries in common programming languages allow configuring handlers to output logs to multiple streams simultaneously. This setup is common in several large projects at the company where I currently work, where applications write logs both to a file on a mounted container volume and to stdout, and these logs then follow different paths downstream.
Regarding the handling of secrets, they should be created as usual using Configuration CRs and then mounted by each instrumented Pod in the Namespace. I didn't clarify this point sufficiently in the initial draft proposal: it's advisable to set up a default Secret per Namespace for mounting configuration into containers. This can be specified within metadata.annotations[] in the Namespace manifest. Here you can also add a flag to specify whether injection for this Namespace should be enabled, provided that the Namespace is matched by the webhook namespace selector; otherwise the flag is simply a decoration. Then, using Pod metadata.annotations[], you can override the default Secret name as needed. Typically, it's a common best practice to maintain consistent logging patterns and formats across applications grouped into a project, preferably ones that are parseable by a common parser. However, and I'm facing such a case right now, this is not always possible. If a common parser doesn't fit the needs of all applications in the Namespace, you still have the flexibility to override the name of the Secret to mount. Going to the extreme, in cases where each logging format or pattern differs, you can still delegate parsing to aggregators and have FluentBit sidecars act solely as forwarders, optionally adding metadata for disambiguation downstream (often relevant at the Namespace level, based on my experience).
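The lookup order described above (Pod annotation first, then the Namespace default) could be sketched as follows. The annotation key fluent.sidecar.io/config-secret is a hypothetical name chosen for illustration, as is the function name.

```python
# Hypothetical annotation key, used on both the Namespace and the Pod.
SECRET_ANNOTATION = "fluent.sidecar.io/config-secret"


def resolve_config_secret(namespace_annotations, pod_annotations):
    """Return the Secret name to mount, or None if nothing is configured."""
    # A Pod-level annotation overrides the Namespace default.
    return (
        pod_annotations.get(SECRET_ANNOTATION)
        or namespace_annotations.get(SECRET_ANNOTATION)
        or None
    )
```

A None result is the case where the webhook would have to skip injection or reject the pod.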

May 14 '24 22:05 AlessandroFazio
