Introduce the concept of "formatting strategy" for http_output
Motivation
I have a use case for using the http_output facility to log directly to a Splunk HEC (HTTP Event Collector) endpoint, which requires wrapping the event in a JSON container schema with some host information and sending an authorization header. My initial approach was to introduce the concept of a formatting strategy for http_output, so a user could simply specify http_output with a formatter of splunk_hec, an additional configuration option that allows HTTP headers to be added (for the authorization token), and the HEC endpoint. If this approach sounds like a good one, I'm happy to contribute the code to implement it.
Feature
Introduce formatting strategies for http_output, including a splunk_hec JSON container formatter, plus a configuration option for arbitrary HTTP headers that are injected into the curl options to support the necessary authorization.
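To make the idea concrete, here is a minimal sketch of the strategy selection in Python (Falco itself is C++, so this only illustrates the shape of the feature). The config keys `formatter` and `headers`, the registry, and all function names are hypothetical and not existing Falco options.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping a configured formatter name to a function
# that turns a Falco event (already parsed into a dict) into a request body.
Formatter = Callable[[dict], str]
FORMATTERS: Dict[str, Formatter] = {}


def register(name: str) -> Callable[[Formatter], Formatter]:
    """Register a formatter under the name used in the (assumed) config."""
    def decorator(fn: Formatter) -> Formatter:
        FORMATTERS[name] = fn
        return fn
    return decorator


@register("default")
def format_default(event: dict) -> str:
    # Existing behaviour: send the event JSON unchanged.
    return json.dumps(event)


@register("splunk_hec")
def format_splunk_hec(event: dict) -> str:
    # HEC expects the payload under an "event" key of a container object.
    return json.dumps({"event": event})


def render(event: dict, config: dict) -> Tuple[str, Dict[str, str]]:
    """Return the body and headers for one HTTP request.

    `config` stands in for the http_output section; both keys are assumptions.
    """
    formatter = FORMATTERS[config.get("formatter", "default")]
    headers = {"Content-Type": "application/json"}
    headers.update(config.get("headers", {}))  # e.g. the authorization token
    return formatter(event), headers
```

With a config such as `{"formatter": "splunk_hec", "headers": {"Authorization": "Splunk <token>"}}`, `render` would produce the wrapped body and the extra header that the HTTP client then sends.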
Alternatives
While it's possible to use another process to re-format the http_output messages, it would be ideal to have this functionality baked into Falco itself to reduce the need for an additional logging process.
Additional context
This would support use cases like https://github.com/falcosecurity/falco/issues/1346, and is somewhat implemented here: https://github.com/falcosecurity/falco/issues/1322
Thank you very much @arirubinstein for opening this issue.
Recognizing that the proposal is scoped only to the HTTP output option, I just wanted to share that there may be an opportunity in the future to streamline such capabilities further. In the past, requests have come up to add custom fields.
https://github.com/falcosecurity/falco/issues/3277#issuecomment-2323065729
Perhaps check out this PR as well for additional context on recent changes and general capability expansions in this regard(ish):
https://github.com/falcosecurity/falco/pull/3308/files
I propose awaiting additional community feedback.
The goal would be to mimic the format below, where event contains the contents of the Falco JSON event. For example, this is the HEC format:
{
  "time": 1437522387,
  "host": "dataserver992.example.com",
  "source": "testapp",
  "event": {
    "message": "Something happened",
    "severity": "INFO"
  }
}
For this functionality, append_output would need a way to address the entire event, or to restructure the format so that the JSON can be placed within a container format. I don't know if that's the best method, but if there are more use cases for generalized formatters within Falco, then this could potentially use such a facility.
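As a rough illustration of the container format, the following Python sketch wraps a Falco JSON event in an HEC envelope. It assumes the Falco event carries a `time` field as an ISO 8601 UTC string with fractional seconds and a trailing `Z`; the function names and the `host` and `source` parameters are hypothetical.

```python
from datetime import datetime, timezone


def falco_time_to_epoch(ts: str) -> float:
    """Convert a timestamp like '2015-07-21T23:46:27.000000000Z' to epoch seconds.

    Assumes UTC with a trailing 'Z'; the fraction is parsed separately because
    strptime's %f accepts at most six digits.
    """
    base, _, frac = ts.rstrip("Z").partition(".")
    whole = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return whole.timestamp() + (float("0." + frac) if frac else 0.0)


def wrap_for_hec(event: dict, host: str, source: str) -> dict:
    """Place the whole Falco event under the 'event' key of an HEC envelope."""
    envelope = {"host": host, "source": source, "event": event}
    if "time" in event:  # only set the envelope time when the event has one
        envelope["time"] = falco_time_to_epoch(event["time"])
    return envelope
```

The event is nested untouched, so nothing from the original Falco output is lost; only the envelope fields are added around it.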
As an alternative, have you evaluated the possibility of creating a small program to format the output and perform the HTTP request, then use it with the Falco program_output?
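For reference, such a helper could look roughly like the Python sketch below. It assumes Falco is configured with JSON output and a kept-alive program_output, so the helper receives one JSON event per line on stdin; the function names are hypothetical, and the URL and token are supplied by the caller. The `Authorization: Splunk <token>` header is the scheme HEC uses.

```python
import json
import urllib.request
from typing import Callable, Iterable


def build_request(url: str, token: str, event: dict) -> urllib.request.Request:
    """Build a POST request that carries one event wrapped for HEC."""
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def _post(request: urllib.request.Request) -> None:
    # Default sender: perform the request and discard the response body.
    with urllib.request.urlopen(request, timeout=5) as response:
        response.read()


def forward(lines: Iterable[str], url: str, token: str,
            send: Callable[[urllib.request.Request], None] = _post) -> int:
    """Forward each JSON line to the endpoint; return how many were sent."""
    sent = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event
        send(build_request(url, token, event))
        sent += 1
    return sent
```

Calling `forward(sys.stdin, url, token)` from a small script (after `import sys`) would turn this into the program that Falco pipes its alerts to. The `send` parameter exists only so the forwarding logic can be exercised without a network.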
I'm considering that as well; however, resource-wise it would be ideal for our use case not to have to cross another process/IPC boundary just to format the data. Part of the thinking is removing the need to co-distribute a log forwarder, which would make Falco easier to install en masse.
Hey @arirubinstein
Generally speaking, I'm not 100% convinced that overloading Falco output channels with too many features is a good idea, but I understand your point and see the value.
So let's consider this tentatively for Falco 0.42.
/milestone 0.42
Meanwhile, it would be nice to estimate the impact on the codebase and, ideally, if it is small, to have a PoC.
cc @falcosecurity/core-maintainers wdyt?
@leogr: The provided milestone is not valid for this repository. Milestones in this repository: [0.41.0, 0.42.0, 1.0.0, TBD]
Use /milestone clear to clear the milestone.
sorry 😅
/milestone 0.42.0
I have a patch in our fork that introduces the concept of "json_style" to the json_output path with default being the existing implementation, and one for splunk_hec addressable by config. To solve the auth, I also added an authorization header flag to the http_output as well. I'll clean it up and put up a PR in the next day or so to get a discussion started.