amazon-eks-ami icon indicating copy to clipboard operation
amazon-eks-ami copied to clipboard

Set eventRecordQPS to 0

Open Pluies opened this issue 5 years ago • 9 comments

*Issue

#99

Description of changes:

See CIS Benchmark 4.2.9

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Pluies avatar Jan 08 '20 17:01 Pluies

I had mentioned in the original PR https://github.com/awslabs/amazon-eks-ami/pull/347#discussion_r346680052 whether or not doing this is actually reasonable, considering that setting this to 0 effectively disables rate limiting of events, can someone with some more knowledge on the subject comment?

stefansedich avatar Jan 21 '20 19:01 stefansedich

@stefansedich for reference and discussion, this is the relevant section in the CIS benchmark:

4.2.9 Ensure that the --event-qps argument is set to 0 or a level which ensures appropriate event capture (Not Scored)

Profile Applicability: Level 2

Description: Security relevant information should be captured. The --event-qps flag on the Kubelet can be used to limit the rate at which events are gathered. Setting this too low could result in relevant events not being logged, however the unlimited setting of 0 could result in a denial of service on the kubelet.

Rationale: It is important to capture all events and not restrict event creation. Events are an important source of security information and analytics that ensure that your environment is consistently monitored using the event data. Audit: Run the following command on each node: ps -ef | grep kubelet Review the value set for the --event-qps argument and determine whether this has been set to an appropriate level for the cluster. The value of 0 can be used to ensure that all events are captured. If the --event-qps argument does not exist, check that there is a Kubelet config file specified by --config and review the value in this location.

Remediation: If using a Kubelet config file, edit the file to set eventRecordQPS: to an appropriate level. If using command line arguments, edit the kubelet service file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf on each worker node and set the below parameter in KUBELET_SYSTEM_PODS_ARGS variable. Based on your system, restart the kubelet service. For example: systemctl daemon-reload systemctl restart kubelet.service

Impact: Setting this parameter to 0 could result in a denial of service condition due to excessive events being created. The cluster's event processing and storage systems should be scaled to handle expected event loads.

Default Value: By default, --event-qps argument is set to 5.

References:

  1. https://kubernetes.io/docs/admin/kubelet/
  2. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/kubel etconfig/v1beta1/types.go

Pluies avatar Jan 22 '20 11:01 Pluies

In my opinion, the tradeoff is between "you may lose security events" and "you may slow down your kubelet because of load", which I'd personally rate as respectively Very Bad and Bad.

I think it makes sense to err on the side of caution and follow the CIS recommendation here, but I'll happily defer to people with more expertise 👍

Pluies avatar Jan 22 '20 11:01 Pluies

Any update on this? :) I understand it warranted an internal discussion but it's been a few months now.

zmarouf avatar Aug 06 '20 07:08 zmarouf

Would also love to hear an update on this. I see MS Azure AKS set it explicitly to 0. I also see recommendations like 12-kubernetes-configuration-best-practices and GKE cis-benchmarks suggesting to set it to 0.

eldada avatar Sep 16 '20 07:09 eldada

Wondering about this too. Has anyone manually set it to 0 and experienced significant impact? Or could CloudWatch be configured to store those events permanently?

tomhuang12 avatar Sep 28 '20 17:09 tomhuang12

I just wanted to highlight a section on one of the links shared in a comment above from @eldada which I think it's key to understand the implications of the value of this variable https://github.com/awslabs/amazon-eks-ami/pull/391#issuecomment-693230869

Events are Kubernetes objects stored in etcd. To avoid overwhelming etcd they are only kept for one hour, and are not an appropriate security auditing mechanism. Allowing unlimited events as suggested in this control exposes the cluster to unnecessary DoS risk and contradicts the recommendation to use admission EventRateLimits. Security relevant events that need permanent storage should be sent to logs.

source: https://cloud.google.com/anthos/gke/docs/on-prem/concepts/cis-benchmarks

zot24 avatar Oct 23 '20 19:10 zot24

This is still showing as an exception on our pen-tests.

Could we please get an answer on whether setting eventRecordQPS to 0 will cause problems, or whether mitigation is planned in a future release?

What is the advise from AWS?

technotaff-nbs avatar Feb 11 '21 12:02 technotaff-nbs

I notice that the --event-qps advice is causing a lot of confusion and potential misconfiguration.

The advice is referring to the deprecated command line option, where 0 is documented to use the default value of 5, whereas in the config file, a value of 0 is documented to disable rate limiting, which is precisely what this rule is supposed to prevent.

https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/options/options.go

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/v1beta1/defaults.go

joebowbeer avatar May 03 '22 05:05 joebowbeer

@stefansedich https://github.com/awslabs/amazon-eks-ami/pull/391/files#r973550647

@technotaff-nbs My advice is to fix the CIS Benchmarks - which I am trying to do, by the way.

joebowbeer avatar Sep 17 '22 06:09 joebowbeer

@joebowbeer thanks for clearing this up; I'm in agreement that the CIS spec needs updating and we shouldn't make this change.

cartermckinnon avatar Sep 30 '22 18:09 cartermckinnon