Context deadline exceeded

Open mtcolman opened this issue 2 years ago • 1 comments

Description

I'm trying to scan my cluster (trivy k8s --report summary cluster), and so far two attempts have failed. I'm now retrying with larger timeout values.

I can't work out from the "FATAL" message whether the scan failed because it hit the timeout, or whether some other failure caused the timeout to be reached.

Could the error output be more informative, i.e. "this scan has failed because it hit the timeout limit before successfully scanning all items" or something like that?
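A minimal sketch of how such a message could be produced in Go, the language Trivy is written in. This is not Trivy's actual code: `isTimeout`, `explainScanError`, and `simulatedScanError` are hypothetical names used only for illustration. It relies on the standard `errors.Is` check, which finds `context.DeadlineExceeded` even when it sits at the bottom of a long chain of wrapped errors like the one in the FATAL line.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// isTimeout reports whether err, or any error it wraps, is a context deadline.
func isTimeout(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// explainScanError is a hypothetical helper (not part of Trivy) that turns a
// wrapped error chain into a message naming the timeout as the cause.
func explainScanError(err error, timeout time.Duration) string {
	if isTimeout(err) {
		return fmt.Sprintf("scan aborted: the %s timeout expired before all resources were scanned; rerun with a larger --timeout", timeout)
	}
	return fmt.Sprintf("scan failed: %v", err)
}

// simulatedScanError mimics the shape of the reported error chain by letting
// a context run past its deadline and wrapping the resulting error twice.
func simulatedScanError() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond)
	defer cancel()
	<-ctx.Done() // block until the deadline passes
	return fmt.Errorf("k8s scan error: %w", fmt.Errorf("scan config error: %w", ctx.Err()))
}

func main() {
	fmt.Println(explainScanError(simulatedScanError(), 5*time.Minute))
}
```

The wrapping has to use `%w` (or an equivalent wrapper that implements `Unwrap`) at every level; otherwise the deadline error is flattened into a string and `errors.Is` can no longer detect it.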

What did you expect to happen?

I expect the cluster resources to be scanned.

What happened instead?

I then receive the following messages:

WARN    Increase --timeout value
FATAL   k8s scan error: scanning misconfigurations error: scan error: image scan failed: failed analysis: failed to call hooks: post handler error: scan config error: context deadline exceeded

Output of run with --debug:

$ trivy k8s --debug --report summary cluster
2022-07-27T10:17:17.180+0100    DEBUG   Severities: ["UNKNOWN" "LOW" "MEDIUM" "HIGH" "CRITICAL"]
2022-07-27T10:17:21.356+0100    DEBUG   cache dir:  /home/matt/.cache/trivy
2022-07-27T10:17:21.356+0100    DEBUG   DB update was skipped because the local DB is the latest
2022-07-27T10:17:21.356+0100    DEBUG   DB Schema: 2, UpdatedAt: 2022-07-27 06:07:47.221501092 +0000 UTC, NextUpdate: 2022-07-27 12:07:47.221500892 +0000 UTC, DownloadedAt: 2022-07-27 09:00:33.9436114 +0000 UTC
91 / 1722 [------>_______________________________________________________________________________________________________________________] 5.28% 0 p/s
2022-07-27T10:22:21.359+0100    WARN    Increase --timeout value
2022-07-27T10:22:21.360+0100    FATAL   k8s scan error:
    github.com/aquasecurity/trivy/pkg/k8s/commands.run
        /home/runner/work/trivy/trivy/pkg/k8s/commands/run.go:72
  - scanning misconfigurations error:
    github.com/aquasecurity/trivy/pkg/k8s/scanner.(*Scanner).Scan
        /home/runner/work/trivy/trivy/pkg/k8s/scanner/scanner.go:72
  - scan error:
    github.com/aquasecurity/trivy/pkg/commands/artifact.(*runner).scanArtifact
        /home/runner/work/trivy/trivy/pkg/commands/artifact/run.go:227
  - image scan failed:
    github.com/aquasecurity/trivy/pkg/commands/artifact.scan
        /home/runner/work/trivy/trivy/pkg/commands/artifact/run.go:531
  - failed analysis:
    github.com/aquasecurity/trivy/pkg/scanner.Scanner.ScanArtifact
        /home/runner/work/trivy/trivy/pkg/scanner/scan.go:127
  - failed to call hooks:
    github.com/aquasecurity/trivy/pkg/fanal/artifact/local.Artifact.Inspect
        /home/runner/work/trivy/trivy/pkg/fanal/artifact/local/fs.go:127
  - post handler error:
    github.com/aquasecurity/trivy/pkg/fanal/handler.Manager.PostHandle
        /home/runner/work/trivy/trivy/pkg/fanal/handler/handler.go:75
  - scan config error:
    github.com/aquasecurity/trivy/pkg/fanal/handler/misconf.misconfPostHandler.Handle
        /home/runner/work/trivy/trivy/pkg/fanal/handler/misconf/misconf.go:244
  - context deadline exceeded

Output of trivy -v:

$ trivy -v
Version: 0.30.4
Vulnerability DB:
  Version: 2
  UpdatedAt: 2022-07-27 06:07:47.221501092 +0000 UTC
  NextUpdate: 2022-07-27 12:07:47.221500892 +0000 UTC
  DownloadedAt: 2022-07-27 09:00:33.9436114 +0000 UTC

Additional details (base image name, container registry info...):

mtcolman avatar Jul 27 '22 09:07 mtcolman

I subsequently ran the scan with a 30m timeout and it completed (in just under 20 minutes). Here is the debug output:

$ trivy k8s --debug --timeout 30m0s --report summary cluster
2022-07-27T10:27:46.422+0100    DEBUG   Severities: ["UNKNOWN" "LOW" "MEDIUM" "HIGH" "CRITICAL"]
2022-07-27T10:27:50.901+0100    DEBUG   cache dir:  /home/matt/.cache/trivy
2022-07-27T10:27:50.901+0100    DEBUG   DB update was skipped because the local DB is the latest
2022-07-27T10:27:50.901+0100    DEBUG   DB Schema: 2, UpdatedAt: 2022-07-27 06:07:47.221501092 +0000 UTC, NextUpdate: 2022-07-27 12:07:47.221500892 +0000 UTC, DownloadedAt: 2022-07-27 09:00:33.9436114 +0000 UTC
1722 / 1722 [--------------------------------------------------------------------------------------------------------------------------] 100.00% 2 p/s
2022-07-27T10:45:14.153+0100    ERROR   Error during vulnerabilities scan: scan error: unable to initialize a scanner: unable to initialize a docker scanner: 4 errors occurred:
        * unable to inspect the image (registry.aquasec.com/database:2022.4): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
        * unable to initialize Podman client: no podman socket found: stat podman/podman.sock: no such file or directory
        * containerd socket not found: /run/containerd/containerd.sock
        * GET https://registry.aquasec.com/v2/database/manifests/2022.4: unexpected status code 401 Unauthorized: <html>
<head><title>401 Authorization Required</title></head>
<body>
<center><h1>401 Authorization Required</h1></center>
<hr><center>openresty/1.19.9.1</center>
</body>
</html>

I'm wondering whether the "unable to inspect the image" error, which appears because it can't find docker or podman, should be raised as a separate ticket. Shouldn't this be checked up front, so that I'm alerted immediately rather than after the scan has run for 20 minutes? (As it stands, I now need to make docker/podman available and rerun.)
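A rough sketch of what an up-front check could look like, written in Go. It is an illustration only and not how Trivy behaves: `runtimeSockets` and `availableSockets` are hypothetical names, and the two paths are simply the ones named in the error output above (the Podman path is left out because the log shows only a relative one). Note that the log also shows a registry fallback failing with 401 Unauthorized, so a missing socket may not be the only cause.

```go
package main

import (
	"fmt"
	"os"
)

// runtimeSockets holds the socket paths that appear in the error output.
// Assumption: these are the locations worth probing on this machine.
var runtimeSockets = []string{
	"/var/run/docker.sock",
	"/run/containerd/containerd.sock",
}

// availableSockets returns the candidate paths that exist and are sockets.
func availableSockets(paths []string) []string {
	var found []string
	for _, p := range paths {
		info, err := os.Stat(p)
		if err == nil && info.Mode()&os.ModeSocket != 0 {
			found = append(found, p)
		}
	}
	return found
}

func main() {
	found := availableSockets(runtimeSockets)
	if len(found) == 0 {
		fmt.Println("no container runtime socket found; image scans will depend on registry access")
		return
	}
	fmt.Println("container runtime sockets available:", found)
}
```

Running a check like this before starting a cluster scan would surface the missing runtime in well under a second instead of after the full scan.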

mtcolman avatar Jul 27 '22 10:07 mtcolman

@mtcolman It seems to me this issue can be closed, because you were able to scan the cluster once you set the timeout to 30m, correct? And perhaps the problem of not being able to scan the image registry.aquasec.com/database:2022.4 should be raised in a separate issue, because it isn't related?

josedonizetti avatar Aug 22 '22 23:08 josedonizetti

I don't think this is an issue with that particular image. I am seeing a similar problem with images in AWS ECR. The problem does not occur when any one of the following conditions is met:

  • The number of images to scan is low because only a single namespace is scanned.
  • A per-namespace scan was run first for some of the namespaces, so the cache is available. It is not necessary to run the per-namespace scan for every namespace that uses the private repository.
  • I run docker login first, so that trivy can use those credentials to pull images.

But when the cache is clear and the number of images to scan is high, trivy k8s --report summary cluster has no problem accessing the external repository (private images in ECR in my case) for some images, yet throws 401 for other, seemingly random images. All of the images that fail with 401 Unauthorized are scanned correctly by trivy if I scan a single namespace instead of the whole cluster.

So @josedonizetti, maybe there is a problem in the mechanism used to access remote private repositories when the volume of such traffic is high, or when the scan takes longer than a certain time limit? In my case the scan takes between 15 and 35 minutes, and the timeout parameter in trivy is set to 60m0s.

I am running my tests on Ubuntu 22.04 with trivy 0.31.2. I saw the same problem, or at least a problem with the same effect of an incomplete report, in trivy 0.28.1.

piotr-janek avatar Sep 02 '22 10:09 piotr-janek

This issue is stale because it has been labeled with inactivity.

github-actions[bot] avatar Mar 09 '23 00:03 github-actions[bot]