[capi] Auditd produces too many logs
What steps did you take and what happened:
I built the latest CAPI Ubuntu image (qemu in this case, but this should affect all targets) and was using it for testing in Cluster API Provider OpenStack. I saw that auditd produces a lot of logs. For example, a simple cluster creation with one control plane node and 5 worker nodes produces about 200 MB of logs on the control plane node alone, and I wasn't even running any tests on the cluster.
An example can be seen here (unfortunately I didn't catch it before it wasted a lot of space in the test buckets).
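For anyone who wants to quantify this on their own nodes, a rough check along these lines shows where the volume goes (the paths are the Ubuntu defaults and an assumption on my part; adjust for your distro):

sudo du -sh /var/log/audit /var/log/syslog   # raw size of the audit log and syslog
sudo grep -c audit /var/log/syslog           # how many syslog lines mention audit
journalctl --disk-usage                      # journald also receives the audit stream by default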
What did you expect to happen:
I was expecting to have some audit logs, but maybe not that many :)
Anything else you would like to add:
I'm mostly opening this issue to ask whether this is intended. I filter out the audit logs in the CAPO e2e tests now, and in my production use cases I don't use the upstream auditd config.
If there is consensus, we could configure auditd to produce a bit less log output.
Environment:
Project: Image Builder for Cluster API
Additional info for Image Builder for Cluster API related issues:
- OS (e.g. from /etc/os-release, or cmd /c ver):
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
- Cluster-api version (if using): latest Cluster API Provider OpenStack, but this shouldn't matter
- Kubernetes version (use kubectl version): v1.20.4
/kind bug
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen /lifecycle frozen
@randomvariable: Reopened this issue.
We have the same issue in CAPA
Looking into this, I think there are two options:
- Make the containerd audit rules opt-in for CIS compliance
- Default to masking the systemd-journald-audit.socket so audit logs only go to /var/log/audit/audit.* (rough sketch below)
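For illustration, the second option boils down to a couple of commands on a systemd host; this is only a sketch of the idea, not how it would actually be wired into the Ansible roles:

# Stop journald from pulling audit records into the journal (and thus syslog);
# auditd keeps writing to /var/log/audit/audit.log as before.
sudo systemctl mask systemd-journald-audit.socket
sudo systemctl restart systemd-journald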
The 2nd option would be the same as on other operating systems?
Might be good to try to keep it consistent.
2nd option would be the same as on other operating systems?
I checked, and in fact Fedora / RHEL do still log to journald; the spamming is caused by the extra rules we added for containerd due to the CIS benchmark.
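If anyone wants to confirm that on a running node, something along these lines should show whether the containerd rules are the source (the containerd key name is an assumption and may differ in your image):

sudo auditctl -l                                     # list the audit rules currently loaded
sudo ausearch -k containerd --start today | wc -l    # count today's events tagged with that key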
And do we have the spam on all OSes, or are these rules Ubuntu-specific?
Spam is on everything except Flatcar by the looks of the Ansible.
Found this GH issue after observing the same behaviour on our machines. Joining a node into a cluster with a handful of containers getting deployed to it produces more than 1 million lines in syslog during startup, most of them from auditd. I can't be the only one thinking that this is a little extreme for a default configuration.
I forgot to follow up on this and just got reminded when browsing the open issues to see if another problem of mine had already been addressed.
Since we build the CAPI images in our own CI, I implemented a workaround for the audit spam using the custom_role_names Packer config key, which gets passed to the Ansible provisioner, to include a role that simply disables that part. I can't copy our exact working setup verbatim, because we're basically extending the build container by dropping some extra files into the right places plus a shell script to glue everything together, but the gist of it is something along the lines of the following:
{
  // This is a JSON file that you pass to packer by adding its path to the PACKER_VAR_FILES env var
  "custom_role": "true",
  "custom_role_names": "auditd-please-shut-up-aaaaaaah"
}
# This is an ansible role that you'd want to put in the correct path (in this example, ansible/roles/auditd-please-shut-up-aaaaaaah/tasks/main.yml)
- name: Disable extended audit rules (https://github.com/kubernetes-sigs/image-builder/issues/556)
  file:
    # https://github.com/kubernetes-sigs/image-builder/blob/f4b84b0c42cf32d3a6bff164a412ea3adfd41915/images/capi/ansible/roles/node/tasks/main.yml#L81
    path: /etc/audit/rules.d/containerd.rules
    state: absent
export PACKER_VAR_FILES="$PACKER_VAR_FILES packer/auditd-please-shut-up-aaaaaaah-override.json"
make build-<provider>-<image>
Hope this helps.