fluent-operator
FluentBit Sidecar Injection
Is your feature request related to a problem? Please describe.
Hello! I work for a really big Italian company in the energy sector, and we are building a log pipeline that uses FluentBit as a sidecar container sending records to FluentD deployment pods. We are onboarding hundreds of microservices, so there is a lot of repetitive work to do right now.
Describe the solution you'd like
So, I'm wondering whether a sidecar injector pattern (the same one you find in OpenTelemetry, Envoy, etc.), built on a combination of pod metadata annotations and a mutating admission webhook, is something you are considering implementing, and if not, why?
Additional context
I've tried to find related issues but didn't find anything. I don't know the Fluent Operator codebase yet, so maybe what I am asking is not feasible in the first place, but since this is a real-world scenario we are facing, it could be of interest to the community.
To conclude, I would be willing to contribute to implementing this feature if you express interest in it.
It's not on the roadmap yet. But Fluent operator maintainers are open to seeing proposals and contributions for this feature. @AlessandroFazio
Hello @benjaminhuo, I'm glad to see you are open to a proposal. I have taken some time to come up with a more formal feature request, which follows.
FluentBit Sidecar Feature Request
Disclosure
Before proceeding with the technical content, I would like to stress 2 points:
- I haven’t had the time to look closely at the Fluent Operator codebase, so I could have missed some points in this initial draft of the solution.
- This is not meant to be an exhaustive description of the feature, it is purely written for you to know more concretely (and likely me too) what I have in mind and how it can be achieved.
Introduction
This feature request aims at adding the ability for FluentOperator users to inject FluentBit as a sidecar container in pods.
This container should in some way fetch the log events produced by the main application container and forward them to some destination. That destination could be FluentD, which would leverage the FluentD CR provided by the operator, or some other destination.
To fetch log events, I propose using an emptyDir volume that the app writes to and FluentBit reads from. The General Considerations section below explains the reason behind this choice.
This solution requires the manager to bootstrap a new webhook server (which, from what I have seen, the operator codebase does not bootstrap yet) serving at least one mutating admission webhook under some path, for example /mutate-core-v1-pods.
Configuration
The mutating webhook could be configured using a ConfigMap added as part of the manifests and mounted as a volume in the controller-manager container. The ConfigMap can be read from its mountPath by adding a flag in the main.go file, together with common logic for loading its content into memory.
The user should opt in to the sidecar injector, i.e. the feature would be disabled by default.
To enable the injection logic, set the first property, enabled (a bool), to true.
So by default the webhook server will be part of the controller-manager process, basically doing nothing and returning the pod object as is.
The ConfigMap could look like the following:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-sidecar-injector
  ...
data:
  enabled: "true"
  fluentBitImage: kubesphere/fluent-bit:tag
  sidecarRequestsCPU: 100m
  ...
  sidecarLimitsCPU: 300m
  ...
Other webhook manifests like Deployment, Service, CertManager related manifests, etc… are skipped for the sake of brevity, but naturally required.
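As a rough illustration of the mounting step only, the fragment below sketches how the controller-manager Deployment might mount the fluent-sidecar-injector ConfigMap and pass its location to the manager. The flag name --sidecar-injector-config-dir, the mount path, the container name, and the labels are assumptions made for this sketch; none of them exist in the operator today.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluent-operator                  # assumed Deployment name
spec:
  selector:
    matchLabels:
      app: fluent-operator               # assumed label
  template:
    metadata:
      labels:
        app: fluent-operator
    spec:
      containers:
        - name: manager                  # assumed container name
          image: kubesphere/fluent-operator:tag
          args:
            # Hypothetical flag: tells main.go where the injector config is mounted.
            - --sidecar-injector-config-dir=/etc/fluent-sidecar-injector
          volumeMounts:
            - name: sidecar-injector-config
              mountPath: /etc/fluent-sidecar-injector
              readOnly: true
      volumes:
        - name: sidecar-injector-config
          configMap:
            name: fluent-sidecar-injector   # the ConfigMap from the proposal
```

With a ConfigMap volume, each key under data shows up as a file in the mount directory, so the manager would read one file per property.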
The webhook configuration manifest could look something like this:
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
webhooks:
  - name: fluent-sidecar.example.com
    namespaceSelector:
      matchLabels:
        fluent.sidecar.io/enabled: "true"
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
        scope: "Namespaced"
    ...
Sidecar Injection Logic
The sidecar injection implementation follows the ‘Istio way’. You label the namespace with fluent.sidecar.io/enabled: "true", and then for each pod created in this namespace the webhook will be called by the API server. This way you can limit the number of pods the webhook will process, based on the namespace selector configuration.
Then, if the user specifies fluent.sidecar.io/inject: "false" in the pod metadata.annotations[], injection will be skipped.
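A minimal sketch of the two opt-in/opt-out switches described above. The namespace name, pod name, and image are placeholders; only the label and annotation keys come from the proposal.

```yaml
# Namespace opted in to injection through the label matched by the webhook's namespaceSelector.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                              # hypothetical namespace
  labels:
    fluent.sidecar.io/enabled: "true"       # label values must be strings, hence the quotes
---
# Pod in that namespace that opts out of injection.
apiVersion: v1
kind: Pod
metadata:
  name: no-sidecar                          # hypothetical pod
  namespace: team-a
  annotations:
    fluent.sidecar.io/inject: "false"       # annotation values must be strings as well
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
```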
Sidecar customization can be achieved either through:
- Pod metadata annotations
- Partial configuration of FluentBit sidecar in pod.spec.containers
Pod Metadata Annotations
The following pod metadata.annotations[] let the user customize the sidecar injection logic.
- Key: fluent.sidecar.io/inject
  Value: bool
  Desc: specify whether the webhook should inject the sidecar
- Key: fluent.sidecar.io/application-logs/path
  Value: string
  Desc: specify the directory in the app container where the app's log appender will write log files
- Key: fluent.sidecar.io/position-db/volume
  Value: bool
  Desc: specify whether the webhook should inject the position-db volume
- Key: fluent.sidecar.io/application-logs/volume-size-limit
  Value: quantity
  Desc: specify the application logs volume size limit
- Key: fluent.sidecar.io/sidecar/request-cpu
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.cpu
- Key: fluent.sidecar.io/sidecar/request-memory
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.memory
- Key: fluent.sidecar.io/sidecar/request-ephemeral-storage
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.requests.ephemeral-storage
- Key: fluent.sidecar.io/sidecar/limit-cpu
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.cpu
- Key: fluent.sidecar.io/sidecar/limit-memory
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.memory
- Key: fluent.sidecar.io/sidecar/limit-ephemeral-storage
  Value: quantity
  Desc: specify the FluentBit sidecar container resources.limits.ephemeral-storage
- Key: fluent.sidecar.io/sidecar/image
  Value: string
  Desc: specify the FluentBit sidecar container image
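To make the list concrete, here is a sketch of a pod carrying these annotations. One caveat: Kubernetes annotation keys allow only a single "/" (between an optional DNS prefix and the name), so keys with two slashes, such as fluent.sidecar.io/sidecar/request-cpu, would be rejected by the API server as written. The sketch therefore assumes hyphenated equivalents; those key names, the values, and the image are illustrative assumptions, not part of the proposal.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  annotations:
    fluent.sidecar.io/inject: "true"
    # Assumed hyphenated forms of the proposed keys (one "/" per key).
    fluent.sidecar.io/application-logs-path: /var/log/app
    fluent.sidecar.io/position-db-volume: "true"
    fluent.sidecar.io/application-logs-volume-size-limit: 1Gi
    fluent.sidecar.io/sidecar-request-cpu: 100m
    fluent.sidecar.io/sidecar-request-memory: 64Mi
    fluent.sidecar.io/sidecar-limit-cpu: 300m
    fluent.sidecar.io/sidecar-limit-memory: 128Mi
    fluent.sidecar.io/sidecar-image: kubesphere/fluent-bit:tag
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
```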
Partial Sidecar Configuration
The user could include the fluent-bit sidecar container directly in the pod .spec.containers[]. The partial configuration could look like the following:
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    # other containers #
    - name: fluent-bit
      image: auto
      ...
If this is done, the following will happen:
- Webhook will know the pod needs to be processed
- Webhook will check if the image field is set to ‘auto’. If so, it will replace it with the KubeSphere FluentBit image
- Webhook will add the usual components to the FluentBit container if not already specified by the user
This customization option is found in the Istio sidecar injection implementation and offers both flexibility and ease of implementation (both for users and for us as developers).
Reconcile Logic
In short, the webhook business logic is:
- Check whether the pod should be skipped, which is the case if either:
  a) the sidecar injector config property is enabled: false
  b) the pod metadata.annotations[] specify fluent.sidecar.io/inject: false
- Add the following to the pod spec.volumes[]:
  a) Application logs emptyDir volume
  b) FluentBit config emptyDir volume
  c) FluentBitDb emptyDir volume
- Create the FluentBit sidecar container
- Mount inside the FluentBit sidecar container:
  a) FluentBit config volume
  b) FluentBitDb volume, if specified in the pod annotations
  c) ApplicationLogs volume
- Add the FluentBit sidecar container to the pod spec.containers[]
- Mount the ApplicationLogs volume into the application container at the specified mount point
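The steps above would turn a single-container pod into something like the following. This is only a sketch of the expected result: the volume names, mount paths, size limit, and resource values are assumptions chosen for illustration, and the config volume is shown as an emptyDir exactly as listed in the steps.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0     # placeholder application image
      volumeMounts:
        - name: application-logs              # the app writes its log files here
          mountPath: /var/log/app
    - name: fluent-bit                        # container added by the webhook
      image: kubesphere/fluent-bit:tag
      resources:
        requests:
          cpu: 100m
        limits:
          cpu: 300m
      volumeMounts:
        - name: application-logs              # same volume, read side
          mountPath: /var/log/app
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/config       # assumed config path
        - name: fluent-bit-db
          mountPath: /fluent-bit/db           # assumed position-db path
  volumes:
    - name: application-logs
      emptyDir:
        sizeLimit: 1Gi                        # assumed volume size limit
    - name: fluent-bit-config
      emptyDir: {}
    - name: fluent-bit-db
      emptyDir: {}
```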
General Considerations
A natural decision is to leverage CRs already provided by the Fluent Operator, such as the configuration-related CRs. Both the namespaced and the cluster-level configuration CRs can be leveraged depending on the user needs.
In both cases the FluentBit configuration secret will be mounted as a volume inside the FluentBit sidecar container.
However, an issue can arise when the user deploys the operator but not the configuration CRs. This is unlikely, but it can happen. In this situation the secret would not be created and, if some namespaces are matched, scheduled pods will be stuck in the ContainerCreating state indefinitely, since they are trying to mount a non-existent secret.
To overcome this issue I have come up with these 2 solutions:
- Create a validating webhook that checks whether the secret exists and rejects the pod if it does not
- Use an InitContainer, equipped with appropriate service account permissions, which queries the APIServer for the secret in its namespace. The InitContainer will fail on a timeout or a resource-not-found error, causing the pod to fail.
As you can imagine they are not mutually exclusive, in the sense that the validation webhook can be useful at least to send warnings to the user.
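A rough sketch of the InitContainer option, assuming an image that ships kubectl and a shell, a secret named fluent-bit-config, and the namespace's default service account. All of these names, the retry budget, and the image are hypothetical.

```yaml
# Lets the pod's service account read only the one secret it needs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: fluent-bit-config-reader
  namespace: team-a
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["fluent-bit-config"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: fluent-bit-config-reader
  namespace: team-a
subjects:
  - kind: ServiceAccount
    name: default
    namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: fluent-bit-config-reader
---
apiVersion: v1
kind: Pod
metadata:
  name: example
  namespace: team-a
spec:
  initContainers:
    - name: wait-for-fluent-bit-config
      image: registry.example.com/kubectl:tag   # placeholder: any image with kubectl and sh
      env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      command:
        - sh
        - -c
        - |
          # Poll for about 60 seconds, then exit non-zero so the failure shows in the pod status.
          for i in $(seq 1 30); do
            kubectl get secret fluent-bit-config -n "$POD_NAMESPACE" >/dev/null 2>&1 && exit 0
            sleep 2
          done
          echo "secret fluent-bit-config not found" >&2
          exit 1
  containers:
    - name: app
      image: registry.example.com/app:1.0       # placeholder application image
```

The trade-off of this option is that every instrumented namespace needs the extra Role and RoleBinding, which the validating webhook option avoids.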
You may notice that, given this issue, it becomes even more important to give the user the ability to skip the injection using pod metadata annotations. I guess there are other ways to deal with this, but I haven't had the time to think about them yet. I hope you can help me in this regard.
As for the emptyDir volume used to store application log files written by the app and read by FluentBit, we actually came up with this solution at my company. Short of weird tricks (I’ve seen a somewhat odd one using Linux pipes), we found no easy way to make log events available to a FluentBit process in a separate container other than mounting that volume and reading the files in it.
To conclude, I haven't talked about kustomize or helm configuration. I know this is a really crucial topic: if this software is not easy to install, it loses a lot of its value. However, I preferred to discuss the business logic and related issues for now. If the discussion about this feature progresses, we will have time to think about the deployment side of things.
Well, this is the end for now. I know that this description is far from complete, but we have to start somewhere. Let me know what you think.
@AlessandroFazio I think the feature is overall useful, but from what I've seen, in k8s a sidecar is only useful for logging when the application isn't writing logs to stdout and instead writes them to some file in the container.
Questions or comments on your proposal:
- You mentioned using existing CRs to configure fluent-bit sidecar. Will this result in creation of a secret per pod that'd be mounted?
- I wouldn't recommend using the same set of CRs for configuring sidecars, since logging sidecar configs are specific to pods being injected into (or the application running in the pod, you'd use a different filter/parser for different applications), so the sidecar configs don't really have a cluster or namespace scope.
Hello @adiforluls, I'm glad to see some feedback on this feature.
To address the points you raised:
- Based on my experience, choosing between writing logs to stdout or the filesystem isn't necessarily an either-or decision. Most mature logging libraries in common programming languages allow configuring handlers to output logs to multiple streams simultaneously. This setup is common in several large projects at the company where I currently work: applications write logs both to a file on a mounted container volume and to stdout, and these logs then follow different paths downstream.
- Regarding the handling of secrets, they should be created as usual using Configuration CRs and then mounted by each instrumented Pod in the Namespace. I didn't clarify this point sufficiently in the initial draft proposal: it's advisable to set up a default Secret per Namespace for mounting configuration into containers. This can be specified within metadata.annotations[] in the Namespace manifest. There you can also add a flag to specify whether injection for this Namespace should be enabled, provided that the Namespace is matched by the webhook namespace selector; otherwise the flag is merely decorative. Then, using Pod metadata.annotations[], you can override the default Secret name as needed. Typically, it's a common best practice to maintain consistent logging patterns and formats across the applications grouped into a project, preferably ones that a common parser can handle. However, and I'm facing such a case right now, this is not always possible. If a common parser doesn't fit the needs of all applications in the Namespace, you still have the flexibility to override the name of the Secret to mount. In the extreme case where every logging format or pattern differs, you can still delegate parsing to aggregators and have FluentBit sidecars act solely as forwarders, optionally adding metadata for disambiguation downstream (often relevant at the Namespace level, based on my experience).
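A sketch of the Namespace default with a per-Pod override described in the last point. The annotation key fluent.sidecar.io/config-secret, the Secret names, and the resource names are hypothetical; the proposal does not fix any of them.

```yaml
# Namespace-level defaults: injection flag plus the Secret mounted by default.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    fluent.sidecar.io/enabled: "true"        # matched by the webhook namespaceSelector
  annotations:
    fluent.sidecar.io/inject: "true"         # namespace-wide injection flag
    fluent.sidecar.io/config-secret: fluent-bit-config-default   # hypothetical key and Secret name
---
# Pod whose log format does not fit the common parser: it overrides the Secret to mount.
apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
  namespace: team-a
  annotations:
    fluent.sidecar.io/config-secret: fluent-bit-config-legacy    # overrides the namespace default
spec:
  containers:
    - name: app
      image: registry.example.com/legacy-app:1.0                 # placeholder image
```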