Falco Webhook getting an error - "http: request body too large"
Describe the bug
How to reproduce it
- Install Falco the normal way with k8s-audit enabled; I used the official Helm charts
Expected behaviour
In my lab everything works perfectly because I don't have a large environment, but in production I was getting the "request body too large" error, so I increased these two parameters to make it work:
```yaml
maxEventSize: 134217728
webhookMaxBatchSize: 268435456
```
Then the pod's memory usage increased to 38 GB or more, and CPU usage rose as well, so I would like to know whether this is a bug.
My environment is very large, but it is strange, because I tested other applications and they stay around 12 GB.
I would like to fix the error; if I did something wrong, please help me with it.
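For context, these two parameters belong to the k8saudit plugin's `init_config` in `falco.yaml`. A minimal sketch of where they sit, assuming the default plugin library paths and listen endpoint (the exact layout in your Helm values may differ):

```yaml
# falco.yaml (plugin section) -- sketch only; library paths and the
# open_params endpoint are the documented defaults, adjust as needed.
plugins:
  - name: k8saudit
    library_path: libk8saudit.so
    init_config:
      maxEventSize: 134217728        # 128 MiB max size of a single audit event
      webhookMaxBatchSize: 268435456 # 256 MiB max size of a webhook POST body
    open_params: "http://:9765/k8s-audit"
  - name: json
    library_path: libjson.so
load_plugins: [k8saudit, json]
```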
Screenshots
Environment
- Falco version:
Thu Sep 14 15:17:25 2023: Falco version: 0.35.1 (x86_64)
Thu Sep 14 15:17:25 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
{"default_driver_version":"5.0.1+driver","driver_api_version":"4.0.0","driver_schema_version":"2.0.0","engine_version":"17","falco_version":"0.35.1","libs_version":"0.11.3","plugin_api_version":"3.0.0"}
- System info:
{ "machine": "x86_64", "nodename": "falco-auditing-56bdb4c9b6-5wbjr", "release": "4.18.0-348.el8.0.2.x86_64", "sysname": "Linux", "version": "#1 SMP Sun Nov 14 00:51:12 UTC 2021" }
- Cloud provider or hardware configuration:
- OS: Red Hat Enterprise Linux 8.5
- Kernel:
4.18.0-348.el8.0.2.x86_64
- Installation method:
Official Helm charts from https://github.com/falcosecurity/charts
Additional context
2023/09/14 15:10:45 [k8saudit] bad request: http: request body too large
2023/09/14 15:10:57 [k8saudit] bad request: http: request body too large
2023/09/14 15:10:59 [k8saudit] bad request: http: request body too large
[... the same error repeats every few seconds until 15:14:14 ...]
2023/09/14 15:14:14 [k8saudit] bad request: http: request body too large
Hey, thank you for reporting!
> Then the POD memory increased to 38gb or more and the cores either, so I would like to know it is a bug or not.
Uhm it seems like a bug, we need to investigate more on this!
Thank you so much @Andreagit97. I will attach below a screenshot of a real query in Prometheus. I set a resource limit of 42 GB, but if I remove that limit it reaches more than 120 GB.
After starting the auditing, memory reaches 22 GB within 3 minutes.
Thank you for the additional data! Right now we are a little bit busy, but we will come back to this after the Falco release!
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Not fixed
/remove-lifecycle stale
You increased maxEventSize to 134217728 bytes (~128 MiB) and webhookMaxBatchSize to 268435456 bytes (~256 MiB)? In that case some of the memory usage is sort of expected, I guess, as up to 256 MiB of JSON may have to be buffered and processed at once...
A few things you might experiment with:
- limit the number of events in a single batch by setting the `--audit-webhook-batch-max-size` flag on your API server; you might need multiple Falco instances to keep up with your audit event stream, as you mention having a large cluster
- use the Falco-tailored `audit-policy.yaml` (docs) in case you are not already doing so, as the API server can generate massive amounts of audit events which are not all relevant to Falco
- as some events include the `requestObject` (e.g. a ConfigMap), you might be able to find the event which includes some massive k8s object and consider dropping it using the `audit-policy.yaml`
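The last two suggestions can be sketched as an audit policy. This is a minimal illustration, not the full Falco-tailored policy; which resources to demote to `Metadata` level is an assumption you would tune for your cluster:

```yaml
# audit-policy.yaml -- minimal sketch; the resource list is an example,
# not the Falco-tailored policy.
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # Log bulky objects at Metadata level so their (potentially huge)
  # requestObject bodies are never sent to the webhook.
  - level: Metadata
    resources:
      - group: ""
        resources: ["configmaps", "secrets"]
  # Everything else keeps request-level detail.
  - level: Request
```

You would pass this to the kube-apiserver with `--audit-policy-file`, and cap the number of events per webhook POST with `--audit-webhook-batch-max-size=<n>`.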
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://github.com/falcosecurity/community. /close
@poiana: Closing this issue.
In response to this:

> Rotten issues close after 30d of inactivity.
> Reopen the issue with /reopen.
> Mark the issue as fresh with /remove-lifecycle rotten.
> Provide feedback via https://github.com/falcosecurity/community. /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.