Introduce the concept of "formatting strategy" for http_output

Open arirubinstein opened this issue 6 months ago • 10 comments

Motivation

I have a use case for using the http_output facility to log directly to a Splunk HEC (HTTP Event Collector) endpoint, which requires a JSON container schema with some host information, an authorization header, and the resulting event embedded in the container. My initial approach was to introduce the concept of a formatting strategy for http_output, so that a user could simply specify http_output with a formatter of splunk_hec, an additional configuration option that allows HTTP headers to be added for an authorization token, and the HEC endpoint. If this approach sounds like a good one, I'm happy to contribute the code to implement it.

Feature

Introduction of formatting strategies for http_output, including a splunk_hec json container formatter, as well as an arbitrary http header configuration and subsequent injection into the curl options to support the necessary authorization.
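
As a rough illustration of the idea (in Python rather than Falco's C++, and not taken from any actual patch), a formatting strategy can be modeled as a named function chosen by configuration, with user-supplied headers merged into the request. The names `FORMATTERS`, `build_http_payload`, and the `formatter` / `extra_headers` parameters are hypothetical and do not correspond to existing Falco options.

```python
import json
from typing import Callable, Dict, Optional, Tuple

# A formatter takes the parsed Falco event and returns the object to send.
Formatter = Callable[[dict], dict]


def format_default(event: dict) -> dict:
    """Existing behavior: send the event unchanged."""
    return event


def format_splunk_hec(event: dict) -> dict:
    """Place the event inside a minimal HEC-style container."""
    return {"event": event}


# Hypothetical registry: config value -> formatting strategy.
FORMATTERS: Dict[str, Formatter] = {
    "default": format_default,
    "splunk_hec": format_splunk_hec,
}


def build_http_payload(
    event: dict,
    formatter: str = "default",
    extra_headers: Optional[Dict[str, str]] = None,
) -> Tuple[bytes, Dict[str, str]]:
    """Return the request body and headers for the chosen strategy."""
    if formatter not in FORMATTERS:
        raise ValueError(f"unknown formatter: {formatter!r}")
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. an Authorization header
    body = json.dumps(FORMATTERS[formatter](event)).encode("utf-8")
    return body, headers
```

With this shape, adding another destination format would mean registering one more function, and the default path stays untouched.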

Alternatives

While it's possible to use another process to re-format the http_output messages, it would be ideal to have this functionality baked into Falco itself to reduce the need for an additional logging

Additional context

This would support use cases like https://github.com/falcosecurity/falco/issues/1346, and it is somewhat implemented in https://github.com/falcosecurity/falco/issues/1322

arirubinstein avatar May 15 '25 22:05 arirubinstein

Thank you very much @arirubinstein for opening this issue.

Recognizing that the proposal is scoped only to the HTTP output option, I just wanted to share that there may be an opportunity in the future to streamline such capabilities further. In the past, requests have come up to add custom fields.

https://github.com/falcosecurity/falco/issues/3277#issuecomment-2323065729

Perhaps check out this PR as well for additional context on recent changes and general capability expansions in this regard(ish):

https://github.com/falcosecurity/falco/pull/3308/files

I propose awaiting additional community feedback.

incertum avatar May 15 '25 23:05 incertum

The goal would be to mimic the format below where event contains the contents of the Falco json event. For example, this is the HEC format.

{
    "time": 1437522387,
    "host": "dataserver992.example.com",
    "source": "testapp",
    "event": { 
        "message": "Something happened",
        "severity": "INFO"
    }
}

For the purposes of this functionality, append_output would need the facility to address the entire event, or re-structure the format to be able to place the json within a container format. I don't know if that's the best method, but if there are more use-cases for generalized formatters within Falco, then potentially this could utilize such a facility.
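
To make the target shape concrete, here is a minimal Python sketch of the wrapping step, assuming the Falco event arrives as a JSON string. The function name `wrap_for_hec` and its parameters are made up for illustration, and the `time` value here is the wrapping time rather than the event's own timestamp, which stays inside `event`.

```python
import json
import socket
import time
from typing import Optional


def wrap_for_hec(
    falco_json: str,
    source: str = "falco",
    host: Optional[str] = None,
    now: Optional[float] = None,
) -> str:
    """Wrap one Falco JSON event in the HEC container shown above."""
    event = json.loads(falco_json)
    envelope = {
        # Assumption: epoch seconds at wrapping time, not the event time.
        "time": int(now if now is not None else time.time()),
        "host": host or socket.gethostname(),
        "source": source,
        # The whole Falco event is nested, untouched, under "event".
        "event": event,
    }
    return json.dumps(envelope)
```

The point of the sketch is that the formatter needs access to the entire serialized event, not just the ability to append fields to it.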

arirubinstein avatar May 15 '25 23:05 arirubinstein

As an alternative, have you evaluated the possibility of creating a small program to format the output and perform the HTTP request, then use it with the Falco program_output?
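
For reference, such a helper could look roughly like the Python sketch below. It assumes json_output is enabled so that each line on stdin is one JSON event, and that the HEC token is sent as `Authorization: Splunk <token>`. The function names and the injectable `send` callable are illustrative choices, not part of Falco.

```python
import json
import urllib.request
from typing import Callable, Iterable


def build_hec_request(line: str, url: str, token: str) -> urllib.request.Request:
    """Build a POST request that wraps one Falco JSON line for HEC."""
    body = json.dumps({"event": json.loads(line)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
    )


def _post(req: urllib.request.Request) -> None:
    """Default sender: perform the request and discard the response body."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()


def forward(
    lines: Iterable[str],
    url: str,
    token: str,
    send: Callable[[urllib.request.Request], object] = _post,
) -> int:
    """Forward every non-empty line; return how many events were sent."""
    sent = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        send(build_hec_request(line, url, token))
        sent += 1
    return sent
```

A script that calls `forward(sys.stdin, url, token)` could then be named in the `program` setting of program_output. This is the extra process that the request is trying to avoid.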

leogr avatar May 21 '25 16:05 leogr

I'm considering that as well. However, resource-wise it would be ideal for our use case not to need another process (and an IPC boundary crossing) just to format the data, if possible. Part of the thinking is to remove the need to co-distribute a log forwarder, which would help make Falco easier to install en masse.

arirubinstein avatar May 21 '25 18:05 arirubinstein

Hey @arirubinstein

Generally speaking, I'm not 100% convinced that overloading Falco output channels with too many features is a good idea, but I understand your point and see the value.

So let's consider this tentatively for Falco 0.42

/milestone 0.42

Meanwhile, it would be nice to estimate the impact on the codebase and, ideally, if it is small, to have a PoC.

cc @falcosecurity/core-maintainers wdyt?

leogr avatar May 27 '25 09:05 leogr

@leogr: The provided milestone is not valid for this repository. Milestones in this repository: [0.41.0, 0.42.0, 1.0.0, TBD]

Use /milestone clear to clear the milestone.

poiana avatar May 27 '25 09:05 poiana

sorry 😅

/milestone 0.42.0

leogr avatar May 27 '25 10:05 leogr

I have a patch in our fork that introduces the concept of "json_style" to the json_output path, with the default being the existing implementation and a splunk_hec style selectable via config. To solve the auth, I also added an authorization header flag to http_output. I'll clean it up and put up a PR in the next day or so to get a discussion started.

arirubinstein avatar May 27 '25 22:05 arirubinstein

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Aug 26 '25 04:08 poiana

/remove-lifecycle stale

arirubinstein avatar Aug 26 '25 06:08 arirubinstein