Falco Webhook getting an error - "http: request body too large"

Describe the bug

How to reproduce it

Install in the usual way to use k8s-audit. I used the official Helm charts.

Expected behaviour

In my lab everything works perfectly because the environment is small, but in production I get an error saying the request body is too large. To make it work, I increased the following two parameters:

maxEventSize: 134217728
webhookMaxBatchSize: 268435456
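
For readers comparing these numbers with the memory figures in this thread, here is a minimal sketch that converts the two settings into binary megabytes. It assumes both values are plain byte counts, which the thread itself does not confirm. The `to_mib` helper is hypothetical and only for illustration.

```python
# Assumption: both settings are plain byte counts (not KB, MB or GB).
MIB = 1024 * 1024


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / MIB


# Values copied from the report above.
settings = {
    "maxEventSize": 134217728,
    "webhookMaxBatchSize": 268435456,
}

for name, value in settings.items():
    print(f"{name}: {value} bytes = {to_mib(value):.0f} MiB")
```

Under that assumption the limits are 128 MiB per event and 256 MiB per webhook request.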

After that, the pod's memory usage grew to 38 GB or more, and CPU usage increased as well, so I would like to know whether this is a bug.

My environment is very large, but this still seems odd, because I tested other applications and they run at around 12 GB.

I would like to fix the error, or if I did something wrong, please help me with it.

Screenshots

Environment

Falco version:

Thu Sep 14 15:17:25 2023: Falco version: 0.35.1 (x86_64)
Thu Sep 14 15:17:25 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
{"default_driver_version":"5.0.1+driver","driver_api_version":"4.0.0","driver_schema_version":"2.0.0","engine_version":"17","falco_version":"0.35.1","libs_version":"0.11.3","plugin_api_version":"3.0.0"}

System info:

{ "machine": "x86_64", "nodename": "falco-auditing-56bdb4c9b6-5wbjr", "release": "4.18.0-348.el8.0.2.x86_64", "sysname": "Linux", "version": "#1 SMP Sun Nov 14 00:51:12 UTC 2021" }

Cloud provider or hardware configuration:
OS: Redhat 8.5

Kernel:

4.18.0-348.el8.0.2.x86_64

Installation method:

Official Helm charts from https://github.com/falcosecurity/charts

Additional context

  2023/09/14 15:10:45 [k8saudit] bad request: http: request body too large

 2023/09/14 15:10:57 [k8saudit] bad request: http: request body too large

 2023/09/14 15:10:59 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:00 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:04 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:05 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:16 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:21 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:23 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:26 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:40 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:43 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:44 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:51 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:56 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:00 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:01 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:03 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:12 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:16 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:17 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:22 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:31 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:32 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:36 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:39 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:43 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:49 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:54 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:55 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:58 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:07 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:12 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:17 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:29 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:34 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:38 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:40 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:51 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:52 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:58 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:01 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:04 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:13 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:14 [k8saudit] bad request: http: request body too large

Sep 14 '23 15:09 antikilahdjs

Hey, thank you for reporting!

Then the POD memory increased to 38gb or more and the cores either, so I would like to know it is a bug or not.

Hmm, it seems like a bug; we need to investigate this further!

Sep 15 '23 13:09 Andreagit97

Thank you so much @Andreagit97. Below is a screenshot of a real query in Prometheus. I set a resource limit of 42 GB, but if I remove that limit, memory usage goes above 120 GB.

After starting the auditing, memory reaches 22 GB within 3 minutes.

Sep 15 '23 14:09 antikilahdjs

Thank you for the additional data. Right now we are a little bit busy, but we will come back to it after the Falco release!

Sep 15 '23 15:09 Andreagit97

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Dec 24 '23 15:12 poiana

Not fixed

Dec 29 '23 01:12 antikilahdjs

/remove-lifecycle stale

Jan 03 '24 13:01 Andreagit97

You increased maxEventSize to 134217728 and webhookMaxBatchSize to 268435456? If those values are in bytes, that is about 134 MB and 268 MB, in which case higher memory usage is sort of expected I guess, as up to 268 MB of JSON may have to be processed in a single request...

A few things you might experiment with:

- Limit the number of events in a single batch by setting the --audit-webhook-batch-max-size flag on your API server. You might need multiple Falco instances to keep up with your audit event stream, as you mention having a large cluster.
- Use the Falco-tailored audit-policy.yaml (docs) in case you are not already doing so, as the API server can generate massive amounts of audit events which are not all relevant to Falco.
- As some events include the requestObject (e.g. a ConfigMap), you might be able to find the event which includes some massive k8s object and consider dropping it using the audit-policy.yaml.
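
To act on the last suggestion, one possible approach is to rank audit events by serialized size and see which resources produce the largest ones. The sketch below is an illustration only. It assumes you have audit events available as JSON lines (for example from the API server's log backend), and it reads the standard audit event fields `verb`, `objectRef.resource` and `objectRef.name`. The function name `largest_audit_events` and the sample data are hypothetical.

```python
import json
from typing import Iterable, List, Tuple


def largest_audit_events(
    lines: Iterable[str], top: int = 5
) -> List[Tuple[int, str, str, str]]:
    """Return (size_in_bytes, verb, resource, name) for the biggest events."""
    sized = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        ref = event.get("objectRef") or {}
        sized.append((
            len(line.encode("utf-8")),  # serialized size of this event
            event.get("verb", "?"),
            ref.get("resource", "?"),
            ref.get("name", "?"),
        ))
    # Largest events first.
    return sorted(sized, reverse=True)[:top]


# Made-up sample: one small event and one carrying a large requestObject.
sample = [
    json.dumps({"verb": "get", "objectRef": {"resource": "pods", "name": "a"}}),
    json.dumps({
        "verb": "update",
        "objectRef": {"resource": "configmaps", "name": "big"},
        "requestObject": {"data": {"blob": "x" * 5000}},
    }),
]

for size, verb, resource, name in largest_audit_events(sample, top=1):
    print(size, verb, resource, name)
```

With real data you would pass an open file handle instead of `sample`. Resources that dominate the top of the list are candidates for a lower audit level or a drop rule in audit-policy.yaml.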

Feb 14 '24 15:02 sboschman

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

May 14 '24 15:05 poiana

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

Jun 29 '24 15:06 poiana

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

Jul 29 '24 16:07 poiana

@poiana: Closing this issue.


Jul 29 '24 16:07 poiana
