Support custom plugins for fluentbit

Open wenchajun opened this issue 3 years ago • 20 comments

Is your feature request related to a problem? Please describe.

Fluent Bit and Fluentd offer an unusually rich set of ways to collect and process logs, mainly thanks to their many plugins. However, fluent-operator does not yet integrate all of them, so the integration work needs to be prioritized. Which plugins do you most want to see integrated?

Describe the solution you'd like

We hope you will make suggestions; we will rank plugin integration work according to priority.

Describe alternatives you've considered

Plugins are integrated by customizing the CRD for each plugin. The steps required are roughly the same in every case, and you are welcome to contribute code to improve plugin integration.

Additional context

No response

wenchajun avatar May 21 '22 15:05 wenchajun

I don't know if FluentBit supports OTLP for logs (public docs would indicate it's only tracing) but I would love to see OTLP for logs.

frankgreco avatar May 27 '22 16:05 frankgreco

OTLP for logs is just a spec to specify log format? https://opentelemetry.io/docs/reference/specification/logs/overview/ If so looks like it can be a separate input plugin in fluentbit
@patrick-stephens @agup006 What do you think?

benjaminhuo avatar May 30 '22 03:05 benjaminhuo

My main concern around this is having a bespoke CRD for every plugin: a plugin has its own configuration, which is essentially bespoke per plugin, and then we wrap this in another interface just for the CRD.

Why can't we just have a generic one that allows us to be flexible for every plugin as it evolves - how is the current approach going to handle changes in configuration parameters in different versions?

Particularly now we have YAML support as well, it seems weird to have YAML configuration for Fluent Bit but then different YAML configuration for the operator. Can we align them to simplify and reduce effort for the operator too?

patrick-stephens avatar Jun 06 '22 09:06 patrick-stephens

If each plugin is a CRD, too many plugins may result in over-complex config generation logic and too many CRDs to watch. We'll upgrade FluentBit and the operator soon enough after a new version of FB is released.

I don't understand the 3rd concern: the FluentBit YAML defines how the Fluent Bit daemonset will be deployed, and the YAML for the operator is just a deployment. Do you mean combining FluentBit with the other CRDs like ClusterInput, ClusterFilter and ClusterOutput?

benjaminhuo avatar Jun 08 '22 09:06 benjaminhuo

Particularly now we have YAML support as well

I think @patrick-stephens is referring to this. Personally, I don't really have an opinion on whether you keep them the same or not. The reality is they won't ever map one to one as we have secret references and other things that can and should be k8s specific.

Anyways, one CRD per FluentBit concept (i.e. Filter, Output, Input) breaks down if there is a breaking change made. If there are no breaking changes made, it's a non issue. If a breaking change is made, the only obvious way to support it in the CRD is to have DatadogV2: ....

I would actually be in favor of a CRD per plugin. I don't think "over-complex config generation logic" and "too many CRDs to watch" are architecture-deterring concerns. That being said, it does not actually help with the concerns mentioned as the CRD version is the group version and not the kind version (unless there's been some change). Hence, there's no difference between multiple CRDs or one.

Finally, I don't think this conversation is a tangent from the original question. Changing the CRD architecture shouldn't be a blocker on iterating on plugin backfill.

frankgreco avatar Jun 08 '22 19:06 frankgreco

Finally, I don't think this conversation is a tangent from the original question. Changing the CRD architecture shouldn't be a blocker on iterating on plugin backfill.

Agreed.

I would caution on CRD updates (i.e. updating CRDs in an installed system) as they can be messy, particularly for Helm users as Helm does not upgrade them so make sure to document any migration issues. However this is all manageable and again a tangent.

Essentially what we're saying though is the CRD is in some fashion coupled to the Fluent Bit version, e.g. a new setting or plugin introduced in version X+1 has to wait for a CRD update to be used even if you just change the image. Similarly say we wanted to use a 1.8 image then we have to find the matching CRD if the latest one triggers config errors.

My thinking was we could have a generic config section that supports key-value pairs and just applies them into the config rather than having named keys we have to keep adding to. Similarly an input/filter/output plugin always has common parameters so if we add a new one, e.g. the recent Nightfall one, we can just provide the name for it to be used if there was a common structure for that. Using a new feature would then be changing the image version and adding the config for it to a stable CRD.
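
To make the generic key-value idea concrete, here is a minimal Go sketch of how an operator could turn an arbitrary map of settings into a classic-mode Fluent Bit section. It is only an illustration: `renderSection` is a hypothetical helper and not part of fluent-operator, and the plugin name and keys in `main` are placeholders rather than real plugin parameters.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderSection is a hypothetical helper (not a fluent-operator API).
// It writes a classic-mode section such as [FILTER] from a plugin name
// and an arbitrary set of key-value pairs, without knowing the keys in advance.
func renderSection(kind, name string, params map[string]string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "[%s]\n", strings.ToUpper(kind))
	fmt.Fprintf(&b, "    Name    %s\n", name)

	// Go maps have no stable iteration order, so sort the keys to keep
	// the generated config deterministic between reconciliations.
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Fprintf(&b, "    %s    %s\n", k, params[k])
	}
	return b.String()
}

func main() {
	// "example" and "Alias" are placeholders chosen for illustration.
	fmt.Print(renderSection("filter", "example", map[string]string{
		"Match": "*",
		"Alias": "demo",
	}))
}
```

With something along these lines, using a setting introduced in a newer image would only need a new key in the map, at the cost of losing per-key validation in the CRD.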

Back to the original question which I think was OTLP log format specification. We do have an experimental OTEL input plugin: https://github.com/fluent/fluent-bit/tree/master/plugins/in_opentelemetry It's not built by default though in the official images and currently supports metrics only. There have been some requests on getting log format as well covered so I think this is reasonable to cover @agup006 ?

patrick-stephens avatar Jun 09 '22 10:06 patrick-stephens

and currently supports metrics only

Yea I'm specifically looking for logs so that I can point FluentBit to an OTLP (L=Logs "I think") compatible sink.

Essentially what we're saying though is the CRD is in some fashion coupled to the Fluent Bit version

That's fine. But we can't ignore what happens when we need to make a breaking change to a current version.

My thinking was we could have a generic config section that supports key-value pairs and just applies them into the config rather than having named keys we have to keep adding to.

Are you suggesting that instead of having Output.Spec.Datadog we do Output.Spec.Config = map[string]interface{}{"datadog": ???}? I love the strongly typed config. It's predictable, testable, unmarshalable, etc.
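
A small Go sketch of why the typed form is easier to validate. `DatadogSpec` here is a hypothetical, trimmed-down struct for illustration and not the operator's real type. With a struct, a strict decoder can reject a misspelled key, while a `map[string]interface{}` accepts anything.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// DatadogSpec is a hypothetical, trimmed-down typed plugin spec.
// The field names are assumptions made for this example.
type DatadogSpec struct {
	Host   string `json:"host"`
	APIKey string `json:"apikey"`
}

// decodeTyped rejects any key that is not a field of DatadogSpec.
func decodeTyped(raw []byte) (DatadogSpec, error) {
	var spec DatadogSpec
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	err := dec.Decode(&spec)
	return spec, err
}

// decodeUntyped accepts any well-formed JSON object, typos included.
func decodeUntyped(raw []byte) (map[string]interface{}, error) {
	var cfg map[string]interface{}
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

func main() {
	typo := []byte(`{"hots": "example.invalid"}`) // "hots" instead of "host"

	_, err := decodeTyped(typo)
	fmt.Println("typed decode failed:", err != nil)

	cfg, err := decodeUntyped(typo)
	fmt.Println("untyped decode failed:", err != nil, "keys:", len(cfg))
}
```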

frankgreco avatar Jun 09 '22 17:06 frankgreco

My thinking was we could have a generic config section that supports key-value pairs and just applies them into the config rather than having named keys we have to keep adding to.

Are you suggesting that instead of having Output.Spec.Datadog we do Output.Spec.Config = map[string]interface{}{"datadog": ???}? I love the strongly typed config. It's predictable, testable, unmarshalable, etc.

Instead of creating a new CRD for each plugin, I think we can have something in the middle that is:

  • Keep adding new plugins in the current way to provide a strong type which is predictable
  • Add a new custom plugin mechanism like below to meet new plugin requirements
Output:
  spec:
    custom:
      name: pluginA
      key1: value1
      key2: value2

Or a new CRD called custom can be created to cover increasing plugin requirements:

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterCustomPlugin
metadata:
  name: plugin1
  labels:
    type: output # or input, filter, parser
    fluentbit.fluent.io/enabled: "false"
    fluentbit.fluent.io/component: logging
spec:
  matchRegex: (?:kube|service)\.(.*)
  params:
    key1: value1
    key2: value2

I counted the plugins in Fluent Bit's official docs, and there are nearly 100 of them. If we add a new CRD for each plugin, the total may exceed the number of CRDs k8s supports.

benjaminhuo avatar Jun 10 '22 13:06 benjaminhuo

Processing logs is the advantage of fluentbit and I don't think OpenTelemetry collector can be better. So supporting OTLP logs via fluentbit is exciting just like collecting OTLP metrics and tracing by fluentbit

benjaminhuo avatar Jun 10 '22 14:06 benjaminhuo

Processing logs is the advantage of fluentbit and I don't think OpenTelemetry collector can be better. So supporting OTLP logs via fluentbit is exciting just like collecting OTLP metrics and tracing by fluentbit

Agreed, so I think the request is support for OTEL logs with an input plugin in Fluent Bit - possibly by extending the current OTEL input plugin but I'm not an expert on that so don't want to say for sure. Probably should raise a linked issue then to add that enhancement over there @agup006 ?

patrick-stephens avatar Jun 10 '22 22:06 patrick-stephens

I am specifically looking for an output. For example, if it had OTLP as an output, I could send FluentBit logs to any OTLP-compatible sink without FluentBit needing to support it natively.

frankgreco avatar Jun 10 '22 22:06 frankgreco

Ah right, sorry it wasn't clear. An extension of the existing OTEL output to include logs as well as the current metrics (and traces coming): https://docs.fluentbit.io/manual/pipeline/outputs/opentelemetry That's probably an enhancement for the main Fluent Bit repo we link back here. @agup006 though is the man to ask on how that best fits.

CRD updates for using OTEL output though might be a good thing @benjaminhuo which I think was the original request here (apologies for taking things off at a tangent). Currently this will be limited to metrics support only initially but then evolve as the OTEL plugin does.

patrick-stephens avatar Jun 10 '22 22:06 patrick-stephens

That's probably an enhancement for the main Fluent Bit repo we link back here.

Yes, you're right. Opened up here

frankgreco avatar Jun 11 '22 00:06 frankgreco

CRD updates for using OTEL output though might be a good thing @benjaminhuo which I think was the original request here (apologies for taking things off at a tangent). Currently this will be limited to metrics support only initially but then evolve as the OTEL plugin does.

Agree, @wenchajun we can create an issue to add the OpenTelemetry output plugin https://docs.fluentbit.io/manual/pipeline/outputs/opentelemetry

benjaminhuo avatar Jun 12 '22 04:06 benjaminhuo

An issue for adding the OpenTelemetry plugin has been created https://github.com/fluent/fluent-operator/issues/325

benjaminhuo avatar Jun 13 '22 08:06 benjaminhuo

@patrick-stephens @frankgreco @wenchajun @wenchajun @zhu733756 @mangoGoForward

I am thinking of adding a custom plugin CRD like this:

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterCustomPlugin
metadata:
  name: splunk-output
  labels:
    type: output # or input, filter, parser
    fluentbit.fluent.io/enabled: "true"
  annotations:
    plugin.config: |
      [OUTPUT]
          Name        splunk
          Match       *
          Host        127.0.0.1
          Port        8088
          TLS         On
          TLS.Verify  Off
spec:
  pluginName: splunk
  pluginType: output
  matchRegex: (?:kube|service)\.(.*)

In the meantime, we keep adding new plugins in the current way to provide a strong type. For any urgent requirement, users can use the custom plugin CRD.

benjaminhuo avatar Jun 20 '22 09:06 benjaminhuo

/lgtm

zhu733756 avatar Jun 20 '22 10:06 zhu733756

Looks reasonable to me, would this also enable custom plugins if people build those into their container image? Rather than a named one supported by OSS but some bespoke Golang or similar one making a .so - I guess we would need a way to register it in the config file though:

[PLUGINS]
  path /a/path/libplugin1.so
  path /b/path/libplugin2.so
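
As a rough sketch of how such registration could be generated, the Go snippet below renders a `[PLUGINS]` section from a list of shared-object paths. `renderPlugins` is a hypothetical helper and not part of fluent-operator; it assumes the paths already exist inside the container image.

```go
package main

import (
	"fmt"
	"strings"
)

// renderPlugins is a hypothetical helper (not a fluent-operator API).
// It emits a [PLUGINS] section with one Path entry per external plugin,
// or an empty string when there is nothing to register.
func renderPlugins(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	var b strings.Builder
	b.WriteString("[PLUGINS]\n")
	for _, p := range paths {
		fmt.Fprintf(&b, "    Path    %s\n", p)
	}
	return b.String()
}

func main() {
	fmt.Print(renderPlugins([]string{
		"/a/path/libplugin1.so",
		"/b/path/libplugin2.so",
	}))
}
```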

patrick-stephens avatar Jun 20 '22 10:06 patrick-stephens

[PLUGINS]
  path /a/path/libplugin1.so
  path /b/path/libplugin2.so

Where should these plugin paths be added? FluentBit itself has a separate configuration mechanism for this?

benjaminhuo avatar Jun 20 '22 14:06 benjaminhuo

[PLUGINS]
  path /a/path/libplugin1.so
  path /b/path/libplugin2.so

Where should these plugin paths be added? FluentBit itself has a separate configuration mechanism for this?

Yup, that is just part of the usual configuration - you can @include it like any other as well. No different to any other config. Anyone using additional Golang output plugins is likely using it.

patrick-stephens avatar Jun 20 '22 14:06 patrick-stephens

I made some changes. Except for the plugin name and plugin type, other parameters are placed under plugin.config. Is this ok? @benjaminhuo

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterCustomPlugin
metadata:
  namespace: fluent
  name: stdout-output
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
  annotations:
    plugin.config: |
      match    *
spec:
  pluginName: stdout
  pluginType: output
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterCustomPlugin
metadata:
  namespace: fluent
  name: mem-input
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
  annotations:
    plugin.config: |
      Tag    memory
spec:
  pluginName: mem
  pluginType: input

Gentleelephant avatar Sep 14 '22 03:09 Gentleelephant

I made some changes. Except for the plugin name and plugin type, other parameters are placed under plugin.config. Is this ok? @benjaminhuo

That's ok.

benjaminhuo avatar Sep 14 '22 03:09 benjaminhuo

@benjaminhuo We can define a generic plugin for input, output and filter, and use it like a normal plugin. Like this:

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  namespace: fluent
  name: cpu-input
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  custom:
    pluginConfig: |
      Name    cpu
      Tag    my_cpu
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  namespace: fluent
  name: kafka-output
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  custom:
    pluginConfig: |
      Name      kafka
      Topics     fluentbit
      Match       *
      Brokers     192.168.100.32:9092
      rdkafka.debug All
      rdkafka.request.required.acks 1
      rdkafka.log.connection.close false
      rdkafka.log_level 7
      rdkafka.metadata.broker.list 192.168.100.32:9092

Gentleelephant avatar Sep 16 '22 06:09 Gentleelephant

I think we can change it a little bit, like this: @Gentleelephant

apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  namespace: fluent
  name: cpu-input
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  customPlugin:
    Config: |
      Name    cpu
      Tag    my_cpu
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  namespace: fluent
  name: kafka-output
  labels:
    fluentbit.fluent.io/enabled: "true"
    fluentbit.fluent.io/mode: "fluentbit-only"
spec:
  customPlugin:
    Config: |
      Name      kafka
      Topics     fluentbit
      Match       *
      Brokers     192.168.100.32:9092
      rdkafka.debug All
      rdkafka.request.required.acks 1
      rdkafka.log.connection.close false
      rdkafka.log_level 7
      rdkafka.metadata.broker.list 192.168.100.32:9092
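
For illustration only, here is a minimal Go sketch of how a raw `Config` block like the ones above could be turned into a Fluent Bit section. `renderCustom` is a hypothetical function; the actual implementation in the operator may parse and validate the text differently.

```go
package main

import (
	"fmt"
	"strings"
)

// renderCustom is a hypothetical helper (not the operator's real code).
// It takes the section kind (input, filter, output) and the raw Config
// text, and emits one normalized "key value" line per setting.
func renderCustom(kind, raw string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "[%s]\n", strings.ToUpper(kind))
	for _, line := range strings.Split(raw, "\n") {
		// Fields splits on any run of whitespace, so uneven spacing
		// in the user's text does not matter.
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // skip blank lines and keys without a value
		}
		key := fields[0]
		// Note: runs of whitespace inside the value collapse to one space.
		value := strings.Join(fields[1:], " ")
		fmt.Fprintf(&b, "    %s    %s\n", key, value)
	}
	return b.String()
}

func main() {
	fmt.Print(renderCustom("input", "Name    cpu\nTag    my_cpu\n"))
}
```

Because the plugin name lives inside the raw text, a sketch like this needs no per-plugin knowledge, which is the point of the custom plugin mechanism.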

benjaminhuo avatar Sep 16 '22 10:09 benjaminhuo

Thanks to @Gentleelephant , the custom plugin is supported now: https://github.com/fluent/fluent-operator/pull/377 https://github.com/fluent/fluent-operator/blob/master/docs/best-practice/custom-plugin.md

benjaminhuo avatar Sep 20 '22 01:09 benjaminhuo