fluent-bit
Support dynamic input for building up OTLP Resource Attributes (possible Lua use case?)
For context, consider a host running:
- Docker
  - configured with the Fluentd logging driver
  - multiple containerized services
- Fluent Bit
  - enriches each log with contextual information from the local environment (eg container_id)

(The full pipeline configuration is at the end.)
Of interest are the containerized services producing logs with log4j/logback to stdout.
Here is an example of data flowing from a service, through Docker, and into the Fluent Bit pipeline, and where the area of opportunity is. Initial log message produced by a service:
{
  "@timestamp": "2025-06-13T17:23:19.876365974-06:00",
  "@version": "1",
  "message": "latency is 3ms",
  "logger": "org.home4s.lutron.leap.LutronLEAPStream",
  "thread": "io-compute-0",
  "severity": "INFO",
  "level_value": 20000,
  "home4s.bridge": "Lutron",
  "service.name": "home4s",
  "service.version": "0.10",
  "service.namespace": "homelab"
}
The Docker Fluentd driver restructures and decorates the log:
{
  "log": "{\"@timestamp\":\"2025-06-03T04:34:16.179429011Z\",\"@version\":\"1\",\"message\":\"latency is 3ms\",\"logger\":\"org.home4s.lutron.leap.LutronLEAPStream\",\"thread\":\"io-compute-0\",\"severity\":\"INFO\",\"level_value\":20000}",
  "container_id": "3286f4562f95063a7ead6b3ca46c895c2eabded97e1d81c5ed52fd545fd5828e",
  "container_name": "/home4s",
  "source": "stdout"
}
In the Fluent Bit pipeline, a JSON parser processor with Key_Name "log" parses the event, with Reserve_Data: true to retain the container metadata (a modify processor then removes source):
{
  "@timestamp": "2025-06-13T17:23:19.876365974-06:00",
  "@version": "1",
  "message": "latency is 3ms",
  "logger": "org.home4s.lutron.leap.LutronLEAPStream",
  "thread": "io-compute-0",
  "severity": "INFO",
  "level_value": 20000,
  "home4s.bridge": "Lutron",
  "service.name": "home4s",
  "service.version": "0.10",
  "service.namespace": "homelab",
  "container_id": "3286f4562f95063a7ead6b3ca46c895c2eabded97e1d81c5ed52fd545fd5828e",
  "container_name": "/home4s"
}
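For completeness, the log4j_json parser referenced by the pipeline is not shown in this issue; a minimal sketch of what it might look like in parsers.conf (the Time_Format here is an assumption based on the @timestamp values above, not taken from the actual config):

```
[PARSER]
    Name        log4j_json
    Format      json
    Time_Key    @timestamp
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
```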
Next, the opentelemetry_envelope processor restructures the event into OTLP form, and content_modifier populates the OTLP Resource Attributes:
- name: opentelemetry_envelope
- name: content_modifier
  context: otel_resource_attributes
  action: upsert
  key: service.name
  value: event.attributes.service.name # <===== THIS IS THE PROBLEM
Finally, the event goes out through the opentelemetry output. The stdout output shows the result (default Fluent format):
Jun 14 21:08:18 home4s fluent-bit[3898584]: GROUP METADATA :
Jun 14 21:08:18 home4s fluent-bit[3898584]: {"schema"=>"otlp", "resource_id"=>0, "scope_id"=>0}
Jun 14 21:08:18 home4s fluent-bit[3898584]: GROUP ATTRIBUTES :
Jun 14 21:08:18 home4s fluent-bit[3898584]: {"resource"=>{"attributes"=>{"service.name"=>"event.attributes.service.name"}}, "scope"=>{}}
Jun 14 21:08:18 home4s fluent-bit[3898584]: [25] home4s:0.10: [[1749935298.081045245, {}], {"@timestamp"=>"2025-06-13T17:23:19.876365974-06:00", "@version"=>"1", "message"=>"latency is 3ms", "logger"=>"org.home4s.lutron.leap.LutronLEAPStream", "thread"=>"io-compute-1", "severity"=>"INFO", "level_value"=>20000, "home4s.bridge"=>"Lutron", "service.name"=>"home4s", "service.version"=>"0.10", "service.namespace"=>"homelab", "container_id"=>"3286f4562f95063a7ead6b3ca46c895c2eabded97e1d81c5ed52fd545fd5828e", "container_name"=>"/home4s"}]
The feature request centres on supporting a dynamic lookup for the value of content_modifier, in this case specifically from the Log Attributes supplied in the original log. As the output above shows, the literal string event.attributes.service.name is used as the resource attribute value rather than being resolved against the record.
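One subtlety for any such lookup: keys like service.name in the parsed record are flat keys that happen to contain dots (as log4j/logback emits them), not nested objects, so resolution should try the literal key before any nested traversal. A minimal Python sketch of that resolution logic (names are illustrative only, not a Fluent Bit API):

```python
def lookup(record: dict, accessor: str):
    """Resolve an accessor like 'service.name' against a log record.

    Flat keys containing dots take precedence over nested traversal,
    since log4j/logback emits attributes such as "service.name" as a
    single top-level key.
    """
    if accessor in record:           # literal flat key, e.g. "service.name"
        return record[accessor]
    node = record                    # fall back to nested traversal
    for part in accessor.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


record = {"service.name": "home4s", "resource": {"attributes": {"env": "homelab"}}}
print(lookup(record, "service.name"))             # -> home4s
print(lookup(record, "resource.attributes.env"))  # -> homelab
```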
Another possible approach would be to create a tag for each service in the Docker logging driver config and a corresponding content_modifier + match for each one.
This is not ideal because the configuration must be manually adjusted every time a service is added or removed. Ideally, Fluent Bit could be configured once and scale dynamically as containerized services come and go. Allowing dynamic lookups in the value of content_modifier would make this possible.
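Two hypothetical shapes the feature could take. Neither is supported by content_modifier today; both the $-accessor value and the value_lua key below are illustrative inventions, not existing syntax:

```yaml
# (a) record-accessor style: resolve the value per record
- name: content_modifier
  context: otel_resource_attributes
  action: upsert
  key: service.name
  value: $service.name          # hypothetical: read from the log record

# (b) the title's "possible Lua use case": evaluate a Lua function per record
- name: content_modifier
  context: otel_resource_attributes
  action: upsert
  key: service.name
  value_lua: |                  # hypothetical key
    function resolve(tag, record)
      return record["service.name"]
    end
```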
Finally, just a question: I never see any metadata in the output records. How is metadata populated in the pipeline?
Complete pipeline configuration:
service:
  flush: 1
  log_level: info
  parsers_file: parsers.conf
pipeline:
  inputs:
    - name: forward
      listen: 0.0.0.0
      port: 24224
      processors:
        logs:
          - name: parser
            match: '*'
            parser: log4j_json
            key_name: log
            reserve_data: 'true'  # keep the docker log driver fields like 'container_id', 'container_name'
          - name: modify
            match: '*'
            Remove: source        # .. but remove 'source', it's always 'stdout'
          - name: opentelemetry_envelope
          - name: content_modifier
            context: otel_resource_attributes
            action: upsert
            key: service.name
            value: log_record.attributes.service.name # <===== THIS IS THE PROBLEM
  outputs:
    - name: opentelemetry
      match: '*'
      host: otel-collector.saxonmt.casa
      port: 443
      tls: 'On'
      log_response_payload: true
      logs_body_key: message
      logs_severity_text_message_key: severity
      logs_severity_number_message_key: level_value
      logs_body_key_attributes: true
    - name: stdout
      # format: json_lines
      match: '*'