fluent-bit icon indicating copy to clipboard operation
fluent-bit copied to clipboard

out_opentelemetry: support metadata key properties

Open nokute78 opened this issue 1 year ago • 8 comments

This patch is to support several metadata key properties to get a OTLP values from metadata. See also https://github.com/fluent/fluent-bit/pull/8294 and https://github.com/fluent/fluent-bit/issues/8205

key Description Default
logs_observed_timestamp_metadata_key Specify an ObservedTimestamp key to look up in the metadata. ObservedKey
logs_timestamp_metadata_key Specify an Timestamp key to look up in the metadata. Timestamp
logs_severity_key_metadata_key Specify an SeverityText key to look up in the metadata. SeverityText
logs_severity_number_metadata_key Specify an SeverityNumber key to look up in the metadata. SeverityNumber
logs_trace_flags_metadata_key Specify an Flags key to look up in the metadata. Flags
logs_span_id_metadata_key Specify an SpanId key to look up in the metadata. SpanId
logs_trace_id_metadata_key Specify an TraceId key to look up in the metadata. TraceId
logs_attributes_metadata_key Specify an Attributes key to look up in the metadata. Attributes
  • [X] Timestamp
  • [X] ObservedTimestamp
  • [X] TraceId
  • [X] SpanId
  • [X] TraceFlags
  • [X] SeverityText
  • [X] SeverityNumber
  • [ ] Resource
  • [ ] InstrumentationScope
  • [X] Attributes

Enter [N/A] in the box, if an item is not applicable to your change.

Testing Before we can approve your change; please submit the following in a comment:

  • [X] Example configuration file for the change
  • [X] Debug log output from testing the change
  • [X] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [ ] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Configuration

[INPUT]
    NAME dummy
    samples 1
    Dummy {"log":"aaaa"}
    Metadata {"ObservedTimestamp":1234, "Timestamp":3456, "SeverityText":"WARN", "SeverityNumber":5, "TraceFlags":13, "Attributes":{"log.file.name":"a.log"}}

[OUTPUT]
    Name   stdout
    Match *

[OUTPUT]
    Name opentelemetry
    Host localhost
    Port 6969
    Match *

Config for otelcol-contrib:

receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:6969

exporters:
  file:
    path: 2_out.json

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [file]

Debug/Valgrind output

[2024/02/11 12:41:01] [ info] [fluent bit] version=3.0.0, commit=9652b0dc6f, pid=61472
[2024/02/11 12:41:01] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/02/11 12:41:01] [ info] [cmetrics] version=0.6.6
[2024/02/11 12:41:01] [ info] [ctraces ] version=0.4.0
[2024/02/11 12:41:01] [ info] [input:dummy:dummy.0] initializing
[2024/02/11 12:41:01] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/02/11 12:41:02] [ info] [output:stdout:stdout.0] worker #0 started
[2024/02/11 12:41:02] [ info] [sp] stream processor started
[0] dummy.0: [[1707622862.439011464, {"ObservedTimestamp"=>1234, "Timestamp"=>3456, "SeverityText"=>"WARN", "SeverityNumber"=>5, "TraceFlags"=>13, "Attributes"=>{"log.file.name"=>"a.log"}}], {"log"=>"aaaa"}]
[2024/02/11 12:41:03] [ info] [output:opentelemetry:opentelemetry.1] localhost:6969, HTTP status=200


^C[2024/02/11 12:41:05] [engine] caught signal (SIGINT)
[2024/02/11 12:41:05] [ warn] [engine] service will shutdown in max 5 seconds
[2024/02/11 12:41:05] [ info] [input] pausing dummy.0
[2024/02/11 12:41:05] [ info] [engine] service has stopped (0 pending tasks)
[2024/02/11 12:41:05] [ info] [input] pausing dummy.0
[2024/02/11 12:41:05] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/02/11 12:41:05] [ info] [output:stdout:stdout.0] thread worker #0 stopped
==61472== 
==61472== HEAP SUMMARY:
==61472==     in use at exit: 0 bytes in 0 blocks
==61472==   total heap usage: 2,236 allocs, 2,236 frees, 1,002,111 bytes allocated
==61472== 
==61472== All heap blocks were freed -- no leaks are possible
==61472== 
==61472== For lists of detected and suppressed errors, rerun with: -s

Output of otelcol-contrib:

{
  "resourceLogs": [
    {
      "resource": {},
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "3456",
              "observedTimeUnixNano": "1234",
              "severityNumber": 5,
              "severityText": "WARN",
              "body": {
                "kvlistValue": {
                  "values": [
                    {
                      "key": "log",
                      "value": {
                        "stringValue": "aaaa"
                      }
                    }
                  ]
                }
              },
              "attributes": [
                {
                  "key": "log.file.name",
                  "value": {
                    "stringValue": "a.log"
                  }
                }
              ],
              "flags": 13,
              "traceId": "",
              "spanId": ""
            }
          ]
        }
      ]
    }
  ]
}

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

nokute78 avatar Feb 11 '24 03:02 nokute78

thanks for working on this PR @nokute78

some feedback:

  • I would suggest to remove the word metadata from the new config properties, as well from the internal variables: logs_span_id_metadata_keylogs_span_id_key.
  • For new properties, similar to what is being done in #8491, use record accessor pattern, so the end user gets the flexibility to use any subkey of the record.
  • The default values for those record accessors, should be the ones expected by the data model, e.g: the default for the expected TraceId should be $TraceId.
  • Attributes handling will be handled differently as described in #8491, so it won't be needed for now. For custom attributes, we will handle it differently through a processor that modifies the original log metadata.

edsiper avatar Feb 16 '24 13:02 edsiper

@edsiper Thank you for feedback.

I would suggest to remove the word metadata from the new config properties

I think following values should be specified.

  • Logs/Metrics/Traces
  • Field of logs
    • https://opentelemetry.io/docs/specs/otel/logs/data-model/#log-and-event-record-definition
  • Metadata/Message
    • https://docs.fluentbit.io/manual/concepts/key-concepts#event-format

I sent a patch to store OTLP data as a metadata. https://github.com/fluent/fluent-bit/pull/8294. These properties for the patch. On the other hand, user may use a value of message. e.g. OTEL supports SeverityText and SeverityNumber and user may want to use a value from in_syslog. https://opentelemetry.io/docs/specs/otel/logs/data-model-appendix/

So I added metadata to properties to clearly distinguish metadata and message.

Or it may be better to support properties to seek a value from metadata AND message. e.g. If user set logs_attributes_key, out_otel lookup event.metadata and event.body.

use record accessor pattern, so the end user gets the flexibility to use any subkey of the record.

I think so. I'll modify this patch.

nokute78 avatar Feb 16 '24 23:02 nokute78

@edsiper

Attributes handling will be handled differently as described in #8491, so it won't be needed for now.

If I wrong please correct me. I think #8491 will not fix #8205 since it converts other metadata to Attributes.

This PR is to fix https://github.com/fluent/fluent-bit/issues/8205. It is to support forwarding OTLP. I sent a patch https://github.com/fluent/fluent-bit/pull/8294 and it creates following metadata for forwarding.

{"ObservedTimestamp":1706312272306657295, "Attributes":{"log.file.name":"a.log"}, "TraceFlags":0, "Resource":{"host.name":"taka-VirtualBox", "os.type":"linux"}, "InstrumentationScope":{"Name":"test_scope", "Version":"1"}}

https://github.com/fluent/fluent-bit/pull/8491 will store all above metadata as Attributes of OTLP. It means fluent-bit changes original Attributes and not forward original OTLP fields.

I tested following configuration and https://github.com/fluent/fluent-bit/pull/8491 patch. Please see https://github.com/fluent/fluent-bit/pull/8491#issuecomment-1949602016

nokute78 avatar Feb 17 '24 00:02 nokute78

Hi @nokute78 ,

#8491 aims to solve only body and attributes that are set as random keys from the record. Yes, I agree that PR is not to solve everything.

If we let the concept of forwarding and in_opentelemetry in the side for a moment, I think this PR should purely focus on the discovery and setting of the keys you described above that are part from the data model:

key Description Default
logs_observed_timestamp_key Specify an ObservedTimestamp key to look up in the metadata. ObservedKey
logs_timestamp_key Specify an Timestamp key to look up in the metadata. Timestamp
logs_severity_key_key Specify an SeverityText key to look up in the metadata. SeverityText
logs_severity_number_key Specify an SeverityNumber key to look up in the metadata. SeverityNumber
logs_trace_flags_key Specify an Flags key to look up in the metadata. Flags
logs_span_id_key Specify an SpanId key to look up in the metadata. SpanId
logs_trace_id_key Specify an TraceId key to look up in the metadata. TraceId

I skipped the logs_attributes_key since it must have a separate handling.

Since the urgency is to solve any structured message to OTLP first, do you think you can adjust the imlementation for the config above ? (by using record accessor as suggested?), on that way we can solve this primary use case, once community confirm is ok we can focus on forwarding in_opentelemetryout_opentelemetry.

let me know what do you think, thanks

edsiper avatar Feb 18 '24 02:02 edsiper

@edsiper Thank you for comment.

I skipped the logs_attributes_key since it must have a separate handling.

Could you tell the detail ? Why are Attributes special ?

https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-attributes

SHOULD follow OpenTelemetry semantic conventions for attributes. This field is optional.

Attributes is an optional field. It should be valid if Attributes is empty. And spec requests to follow OTEL semantic conventions. I think #8491 can be against the semantic conventions.

the urgency is to solve any structured message to OTLP first

In my opinion, out_opentelemetry should be

  • Default value of optional fields are empty.
    • Note: almost of fields are optional. Attributes is one of them.
    • Please see https://opentelemetry.io/docs/specs/otel/logs/data-model/#log-and-event-record-definition
  • logs_body_key is to scrape off Body and don't touch Attributes.
  • logs_AAA_metadata_key of this PR to specify a value of optional field. If the value is found, fluent-bit set it as a value of optional field.
    • e.g. User specify logs_attributes_metadata_key and the value is found, fluent-bit overwrites Attributes of OTLP.
  • It may be better to set default value of optional field.
    • e.g. A property default_trace_id and user can set the default trace_id.

logs_observed_timestamp_metadata_key

By the way, this metadata means a METADATA of fluent-bit event. That is not for MESSAGE a.k.a record. https://docs.fluentbit.io/manual/concepts/key-concepts#event-format

[[TIMESTAMP, METADATA], MESSAGE]
             ^^^^^^^^

nokute78 avatar Feb 18 '24 08:02 nokute78

@nokute78 I have updated #8491, now it does not auto-populate attributes with remaining keys unless logs_body_key_attributes is enabled (disabled by default). Note that PR already take care of log metadata and set them as attributes. So I think this PR should focus in the other keys as mentioned.

edsiper avatar Feb 18 '24 20:02 edsiper

@edsiper #8491 is for MESSAGE and this patch is for METADATA. They are different feature.

By the way I think it is better that the default value of logs_*_metadata_key is NULL to disable by default.

nokute78 avatar Feb 23 '24 01:02 nokute78

@braydonk Could you check this PR ? This patch is to set a value for OTLP from metadata.

As you know https://github.com/fluent/fluent-bit/pull/8294 is to store OTLP field as metadata of fluent-bit event. This patch is to read metadata and set these value as fields of OTLP. It is to forward OTLP using fluent-bit. https://github.com/fluent/fluent-bit/issues/8205

nokute78 avatar Feb 23 '24 01:02 nokute78

@nokute78 @edsiper When will this be merged? And which release will it be part of?

cb645j avatar Mar 04 '24 20:03 cb645j

@nokute78 can you please check @braydonk comments so we can get this merged

cb645j avatar Mar 05 '24 19:03 cb645j

@braydonk @edsiper @nokute78 how can we get this completed do we can merge. Fluent bit will not work correctly with opentelemtry until this is merged. My company is blocked by these fluentbit otel issues.

cb645j avatar Mar 08 '24 17:03 cb645j

@braydonk can you approve please

cb645j avatar Mar 11 '24 18:03 cb645j

My approval won't have any effect on moving this PR forward as I'm not a maintainer and my review wasn't directly requested, but I can press the button if it helps.

braydonk avatar Mar 11 '24 18:03 braydonk

My approval won't have any effect on moving this PR forward as I'm not a maintainer and my review wasn't directly requested, but I can press the button if it helps.

I see, i guess we need someone with the magic touch

cb645j avatar Mar 11 '24 18:03 cb645j

@edsiper @koleini @fujimotos @leonardo-albertovich Can you please approve/review so we can move this forward.

cb645j avatar Mar 11 '24 18:03 cb645j

@cb645j This patch is to get a value from METADATA and set as a value of OTLP. It is not for MESSAGE. https://docs.fluentbit.io/manual/concepts/key-concepts#event-format

There are some points to discuss.

  1. @edsiper requested removing a property for Attributes. logs_attributes_metadata_key. https://github.com/fluent/fluent-bit/pull/8475#issuecomment-1950775005 I think it is needed.

  2. @edsiper requested removing a metadata from property names. https://github.com/fluent/fluent-bit/pull/8475#issuecomment-1950775005 I think it is needed to clarify that the property is for METADATA not for MESSAGE. It will be explicit if a new property for MESSAGE will be supported.

  3. Default value of properties. https://github.com/fluent/fluent-bit/pull/8475#issuecomment-1960594081 Some users don't need this feature. It may be better that the default value is NULL to disable this feature as a default. On the other hand, setting all these values can be troublesome.

  4. The properties for Resource and InstrumentationScope are not implemented now.

nokute78 avatar Mar 11 '24 23:03 nokute78

@nokute78 I see what your saying now. Can we not change this to get this data from the message or support both like you suggested in one of your comments. So the user can specify either logs_trace_id_message_key or logs_trace_id_metadata_key, then based on that the out_otel will get the trace id either from metadata or message depending on which key property you have set....... Im not sure how you even set the metadata.... All my log events have a blank metadata but im NOT using the in_otel, im using tail to read logs from a file... So my log events look like this

[55] appTag: [[1710185880.635000000, {}], {"timestamp"=>"2024-03-11 14:38:00.635", "loglevel"=>"INFO", "trace_id"=>"95e1d11ece6460e7d00c61d45cc195ff", "span_id"=>"11aafe22712ca02c", "message"=>"Hello world app is up and running"}]

cb645j avatar Mar 12 '24 16:03 cb645j

@cb645j Thank you for information. Currently this patch can't solve your issue.

It may be better to support properties for MESSAGE like logs_trace_id_message_key you suggested. But It needs a lot of properties and it is complicated..

nokute78 avatar Mar 13 '24 12:03 nokute78

@nokute78 I will open up a separate PR

https://github.com/fluent/fluent-bit/pull/8644

@nokute78 can you post an example opentelemetry output configuration where you actually set these new properties (logs_span_id_metadata_key, etc...)

cb645j avatar Mar 13 '24 16:03 cb645j

@nokute78 @edsiper I tested this and the trace id and span id fields do not work as strings. The other fields work but trace id and span id dont. It looks like @nokute78 test above did not cover testing these fields. In his metadat example test above he left these out. This is what he had.

Metadata {"ObservedTimestamp":1234, "Timestamp":3456, "SeverityText":"WARN", "SeverityNumber":5, "TraceFlags":13, "Attributes":{"log.file.name":"a.log"}

If you add in TraceId and SpanId to his example above, those fields dont get set still. You can test with this and see.

Metadata {"ObservedTimestamp":1234, "Timestamp":3456, "TraceId": "95e1d11ece6460e7d00c61d45cc195ff", "SeverityText":"WARN", "SeverityNumber":5, "TraceFlags":13, "Attributes":{"log.file.name":"a.log"}

So you have merged code that does not fully work

cb645j avatar Mar 18 '24 19:03 cb645j

@cb645j The type of TraceId and SpanId are byte sequence. String will be ignored since the data type is different.

https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-traceid https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-spanid

nokute78 avatar Mar 23 '24 01:03 nokute78

@nokute78 thanks, yes i figured that out. For when getting from message of log event, since the traceid and spanid are strings in this case, i had to convert to byte arrays. Please see my PR and make any suggestions or comments. i was able to get it working for the 4 fields i added (span-id, trace-id, sev text, and sev number).

https://github.com/fluent/fluent-bit/pull/8644

cb645j avatar Mar 26 '24 20:03 cb645j