
Running operator on containerd cuts the logs in client/server mode

Open · 1003n40 opened this issue 2 years ago

What steps did you take and what happened:

I took the latest version of trivy-operator and started it in client/server mode. It works for most pods, but at random it starts to fail because the logs are cut and therefore cannot be parsed when retrieved. This does not happen when running on dockerd as the CRI. The same can happen when multiple pods run the same client command against the server.

What did you expect to happen:

I expect logs not to get cut when running on containerd as the CRI.

Anything else you would like to add:

One thing that can also be done is to revert the trivy image version to 0.29.2 (in values.yaml of the helm chart), as the new 0.30.* versions use significantly more memory, which causes the pods created by the scan job to be OOM-killed because the limits were not adjusted when the new release was created.

What probably causes the problem is that having multiple clients opens multiple RPC channels, and some of them are closed/garbage-collected in the middle of transferring the information.

  • Trivy-Operator version (use trivy-operator version): latest
  • Kubernetes version (use kubectl version): 1.21.14
  • OS (macOS 10.15, Windows 10, Ubuntu 19.10 etc): gardenlinux
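
For context, this is roughly the setup being described, as a minimal values.yaml sketch. The trivy.mode, trivy.serverURL and trivy.resources keys follow the helm chart's conventions; the server address and memory figure here are assumptions for illustration, not values taken from this issue:

# values.yaml (sketch)
trivy:
  mode: ClientServer                                  # scan jobs call `trivy client` against a central server
  serverURL: http://trivy-server.trivy-system:4954    # hypothetical in-cluster server address
  resources:
    limits:
      memory: 500M                                    # assumption: raise this if 0.30.* scan jobs get OOM-killed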

1003n40 avatar Aug 09 '22 08:08 1003n40

@1003n40 thank you for the input. Can you please add logging or additional info on the failure?

chen-keinan avatar Aug 16 '22 08:08 chen-keinan

@chen-keinan could this also be related to the Kubernetes log rotation configuration?

josedonizetti avatar Aug 22 '22 20:08 josedonizetti

@1003n40 you can change the trivy image tag by setting this value in the trivy-operator-trivy-config ConfigMap:

imageRef: ghcr.io/aquasecurity/trivy:0.29.1
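
A minimal sketch of that edit, assuming the operator is installed in the trivy-system namespace and the ConfigMap key carries a trivy. prefix (check your install for the exact names):

apiVersion: v1
kind: ConfigMap
metadata:
  name: trivy-operator-trivy-config
  namespace: trivy-system        # assumption: namespace the operator was installed into
data:
  trivy.imageRef: ghcr.io/aquasecurity/trivy:0.29.1   # pin the scanner image back to 0.29.x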

chen-keinan avatar Aug 23 '22 05:08 chen-keinan

This issue is stale because it has been labeled with inactivity.

github-actions[bot] avatar Nov 21 '22 00:11 github-actions[bot]

This issue is stale because it has been labeled with inactivity.

github-actions[bot] avatar Feb 22 '23 00:02 github-actions[bot]

Hi, I'm not sure if this is related, but we are seeing the same behavior when running trivy client/server in a k3d/k3s cluster. The trivy client runs in a Kubernetes Job, and sometimes the scan results are cut when we fetch the logs from the Job.

We are only able to reproduce this behavior on GitHub when using the default runners (https://github.com/statnett/image-scanner-operator/actions/runs/4419186007/jobs/7747301776#step:13:381) and when running the tests on an old Mac. And the behavior typically only occurs when the scan results are large.

We suspect that this is related to hardware constraints, since we are only able to reproduce this when running k3d/k3s on machines with limited CPU/memory resources.

bendikp avatar Mar 15 '23 10:03 bendikp

The corresponding log on the node looks like this:

2023-03-14T18:54:13.784447539Z stdout F   {
2023-03-14T18:54:13.784566339Z stdout F     "fixedVersion": "5.20.2-3+deb8u11",
2023-03-14T18:54:13.784572939Z stdout F     "installedVersion": "5.20.2-3+deb8u6",
2023-03-14T18:54:13.784576439Z stdout F     "pkgName": "perl",
2023-03-14T18:54:13.784580039Z stdout F     "primaryURL": "https://avd.aquasec.com/nvd/cve-2018-12015",
2023-03-14T18:54:13.784583439Z stdout F     "severity": "HIGH",
2023-03-14T18:54:13.784586939Z stdout F     "title": "perl: Directory traversal in Archive::Tar",
2023-03-14T18:54:13.784591039Z stdout P     "vulnerabilityID": "CVE-2018-120

Where stdout P indicates a partial log entry: in the CRI log format, F marks a complete line and P marks a partial line that should be continued by a subsequent entry.

bendikp avatar Mar 15 '23 10:03 bendikp

This could be related to container log rotation.

Workaround: increase the kubelet default --container-log-max-size.

trivy-operator supports compression of the scan-job log output to avoid this issue.
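
A sketch of the kubelet side of that workaround; containerLogMaxSize and containerLogMaxFiles are standard KubeletConfiguration fields, and the 50Mi figure is only an example:

# KubeletConfiguration fragment -- raise the per-container log size limit
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 50Mi      # default is 10Mi; large scan results need more headroom
containerLogMaxFiles: 5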

chen-keinan avatar Mar 15 '23 11:03 chen-keinan

Update: after switching to "larger runners" on GitHub we haven't seen this issue.

I don't think it's related to log rotation, as the log file was only ~2MB and there was no second log file available on the node.

bendikp avatar Mar 17 '23 13:03 bendikp

@chen-keinan see containerd/containerd#7289. It is indeed a containerd problem.

1003n40 avatar Mar 18 '23 15:03 1003n40

This issue is stale because it has been labeled with inactivity.

github-actions[bot] avatar Jun 28 '23 00:06 github-actions[bot]