out_splunk: remove raw endpoint
Fixes #8927. This does not remove the ability to send raw events, i.e. using Splunk_Send_Raw On, but rather sends them to the correct endpoint.
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
- [N/A] Example configuration file for the change
- [N/A] Debug log output from testing the change
- [N/A] Attached Valgrind output that shows no leaks or memory corruption was found
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
- [N/A] Run local packaging test showing all targets (including any new ones) build.
- [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).
Documentation
- [N/A] Documentation required for this feature
Backporting
- [N/A] Backport to latest stable release.
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Is /services/collector/event able to receive raw events?
@edsiper Could you define what exactly you mean by "raw events"? The term has a different meaning in fluent-bit than in splunk as explained in https://github.com/fluent/fluent-bit/issues/8927#issue-2339984112.
I will double check on this. I cannot remember all the details of the raw endpoint or why I implemented it that way at the time (I'm asking another maintainer to take a look at this too). Thank you.
According to the official Splunk docs, Fluent Bit needs to pass a channel, either as a URL parameter or in the x-splunk-request-channel header, when sending events to the raw endpoint.
> Channel
> This endpoint requires a data channel GUID to differentiate data from different clients. Generate a GUID and provide it in a POST request as a custom HTTP header or as a parameter.
> If a channel is not provided in the POST request, an error response is sent. Only valid GUIDs can be used. An error message is returned if GUID validation fails.
ref: https://docs.splunk.com/Documentation/Splunk/9.2.1/RESTREF/RESTinput#services.2Fcollector.2Fraw
ref: https://docs.splunk.com/Documentation/Splunk/9.2.1/Data/AboutHECIDXAck#About_channels_and_sending_data
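For illustration, here is a minimal Python sketch of the two ways the quoted docs describe for supplying the channel: as a custom header or as a URL parameter. It only builds the URL and headers and sends nothing. The helper name `build_raw_request` and its parameters are hypothetical, and the GUID check via `uuid.UUID` is only a loose stand-in for whatever validation Splunk actually performs.

```python
import uuid
from urllib.parse import urlencode

RAW_PATH = "/services/collector/raw"


def build_raw_request(base_url, token, channel=None, channel_in_header=True):
    """Return (url, headers) for a POST to the HEC raw endpoint.

    Hypothetical helper for illustration only; it does not send anything.
    """
    # Generate a GUID when the caller did not supply one.
    channel = channel or str(uuid.uuid4())
    # Loose sanity check (assumption): uuid.UUID raises ValueError for
    # strings it cannot parse. Splunk's own validation may be stricter.
    uuid.UUID(channel)

    url = base_url.rstrip("/") + RAW_PATH
    headers = {"Authorization": f"Splunk {token}"}
    if channel_in_header:
        # Option 1: channel as a custom HTTP header.
        headers["X-Splunk-Request-Channel"] = channel
    else:
        # Option 2: channel as a URL parameter.
        url += "?" + urlencode({"channel": channel})
    return url, headers
```

Either form carries the same GUID; which one a client uses is a matter of convenience.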
It seems that the raw endpoint can handle JSON logs, because the examples in the docs include a JSON payload.
However, Splunk's documentation may be confusing in this case, because without indexer acknowledgment there is no need to use channels.
> Sending events to HEC with indexer acknowledgment active is similar to sending them with the setting off. There is one crucial difference: when you have indexer acknowledgment turned on, you must specify a channel when you send events.
ref: https://docs.splunk.com/Documentation/Splunk/9.2.1/Data/AboutHECIDXAck#About_channels_and_sending_data
JSON request with timestamp
curl https://localhost:8088/services/collector/raw?channel=934793C0-FC91-467E-965A-7EAACEFBC4AB -H 'Authorization: Splunk 934793C0-FC91-467E-965A-7EAACEFBC4AB' -d '{"message":"Hello World", "date":"Wed Aug 10 12:27:53 PDT 2016"}'
If we use it only for structured data, we can remove the raw endpoint from out_splunk. However, I observed that the raw endpoint can handle raw JSON events when indexer acknowledgment is off.
Plus, if we remove the raw endpoint and no longer need to specify it, we need to remove the splunk_send_raw config map, which is defined here: https://github.com/fluent/fluent-bit/blob/master/plugins/out_splunk/splunk.c#L919-L925
> If we use it only for structured data, we can remove the raw endpoint from out_splunk. However, I observed that the raw endpoint can handle raw JSON events when indexer acknowledgment is off.
Yeah, but if the event endpoint does what we want and we never send raw strings, there is no point in ever trying to send something to the raw endpoint. Hence, this PR.
> Plus, if we remove the raw endpoint and no longer need to specify it, we need to remove the splunk_send_raw config map, which is defined here: master/plugins/out_splunk/splunk.c#L919-L925
That is not what we want. The "raw mode" in fluent-bit means that the record is sent as is to Splunk without any processing (except for #8926). If activated, the user is responsible for bringing the record into the format required by Splunk, for example by using a Lua filter before it. This behavior is necessary for cases where the configuration options that the out_splunk plugin provides are not sufficient. I'm facing such a use case and thus cannot use Splunk_Send_Raw Off.
When Splunk_Send_Raw Off is configured (default), the whole record is nested under the event key and one can configure other options to be inserted into the JSON data that is being sent to splunk. This is useful for a simple use case.
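To make the two modes concrete, here is a rough Python sketch of the payload shapes described above. It is not the plugin's C code: `build_hec_payload`, its parameters, the `extra_fields` argument (standing in for the other configurable options), and the `time` key are assumptions for illustration.

```python
import json


def build_hec_payload(record, send_raw, timestamp, extra_fields=None):
    """Return the JSON body that would be POSTed for one record.

    Hypothetical helper; it only mimics the behavior described in this thread.
    """
    if send_raw:
        # Raw mode: the record goes out unchanged, so an earlier step
        # (e.g. a Lua filter) must already have shaped it for Splunk.
        return json.dumps(record)

    # Default mode: nest the whole record under the "event" key and add
    # the extra top-level fields derived from the plugin configuration.
    payload = {"time": timestamp}  # "time" key is an assumption here
    payload.update(extra_fields or {})
    payload["event"] = record
    return json.dumps(payload)
```

With `send_raw=False`, a record `{"log": "hello"}` ends up under `event`; with `send_raw=True`, the caller has to supply the `event` key (and anything else Splunk expects) themselves.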
Ah, I got it. So using the raw endpoint is currently inefficient and inappropriate in fluent-bit. This motivation is what I wanted to know. I really appreciate the explanation.
I now see that this change is reasonable. But the behavior changes should be described properly in fluent-bit's documentation.
Here is out_splunk's documentation: https://github.com/fluent/fluent-bit-docs/blob/master/pipeline/outputs/splunk.md#sending-raw-events
I also understand what you mean in this PR. Splunk_Send_Raw is used for sending events that you have already shaped yourself. In some cases, as described in the documentation, those are intended to behave like Splunk metrics.
> But the behavior changes should be described properly in fluent-bit's documentation.
The documentation currently doesn't say anything about the endpoint the data is being sent to. I think this is fine given that this is more of an implementation detail of fluent-bit.
As for documenting the change: Splunk_Send_Raw On has not worked since fluent-bit==1.8, when the raw endpoint was introduced. Or maybe it did initially, since one might have been able to send JSON data (https://github.com/fluent/fluent-bit/pull/9007#issuecomment-2199665102), but it certainly doesn't work on fluent-bit==3.0.7. So I would treat this as a bug fix rather than a feature change.
thanks everybody