
Possibility to Exclude Jobs/cronjobs from scanning

Open VF-mbrauer opened this issue 2 years ago • 12 comments

Since jobs/cronjobs represent transient workloads, it should be possible to skip/exclude them from scanning, as is already possible for complete namespaces. Any configuration option would help here to stop frequent scans when a job is scheduled to run at a very short interval.

@erikgb, @chen-keinan: cc

VF-mbrauer avatar Jul 07 '22 18:07 VF-mbrauer

@VF-mbrauer Is the idea here to exclude the scanning for all cronjobs? Because if you have a cronjob running every minute for example, it should only be scanned once, right? The image/config wouldn't change, so for a specific cronjob we would have only one scan unless it was changed.

josedonizetti avatar Jul 20 '22 12:07 josedonizetti

If we were to add a feature to achieve this, I think it should be more generic - allowing you to exclude one or more workload resources from scanning.

erikgb avatar Jul 20 '22 13:07 erikgb

@erikgb exactly, we spoke about it at some point, of using annotations to ignore specific scans. That is why I'm trying to understand what was the idea proposed here.

josedonizetti avatar Jul 20 '22 13:07 josedonizetti

@erikgb exactly, we spoke about it at some point, of using annotations to ignore specific scans. That is why I'm trying to understand what was the idea proposed here.

@josedonizetti Annotations? On the workload itself? That would be another use-case IMO. I think what @VF-mbrauer is asking for, is a feature on the operator config level.

erikgb avatar Jul 20 '22 13:07 erikgb

@erikgb Yeah, but did you understand exactly the feature/reason? A configuration at the operator level to do what? Exclude all cronjobs/jobs scan? For what reason? Frequency of scan? (like for the same job? or because there are too many jobs?)

josedonizetti avatar Jul 20 '22 13:07 josedonizetti

@erikgb Yeah, but did you understand exactly the feature/reason? A configuration at the operator level to do what? Exclude all cronjobs/jobs scan? For what reason? Frequency of scan? (like for the same cronjob? or because there are too many jobs?)

Well, @VF-mbrauer should answer that. But I think it would be a nice feature to have anyway. Let's say your cluster spins up a lot of jobs from an external controller (not cronjobs). It would be nice to have the option to exclude them from scanning. The jobs would typically use the same image, and the way trivy-operator is designed, it cannot handle that. Scans by image (and not by workload) could mend this, together with spec/status. But we are far from there...
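To illustrate the "scans by image" idea, here is a minimal sketch in Go. It is not trivy-operator code: the `workload` type and the `uniqueImages` function are hypothetical names made up for this example. It only shows how many jobs sharing one image collapse into a single scan target.

```go
package main

import (
	"fmt"
	"sort"
)

// workload is a hypothetical, simplified view of a Job:
// its name and the images used by its containers.
type workload struct {
	Name   string
	Images []string
}

// uniqueImages returns the sorted set of distinct image references
// used across all workloads, so each image is scanned only once.
func uniqueImages(workloads []workload) []string {
	seen := make(map[string]struct{})
	for _, w := range workloads {
		for _, img := range w.Images {
			seen[img] = struct{}{}
		}
	}
	images := make([]string, 0, len(seen))
	for img := range seen {
		images = append(images, img)
	}
	sort.Strings(images)
	return images
}

func main() {
	// Three jobs created by an external controller, all using the same image.
	jobs := []workload{
		{Name: "job-1", Images: []string{"example/worker:1.0"}},
		{Name: "job-2", Images: []string{"example/worker:1.0"}},
		{Name: "job-3", Images: []string{"example/worker:1.0"}},
	}
	fmt.Println(len(jobs), "jobs,", len(uniqueImages(jobs)), "image(s) to scan")
}
```

With scanning keyed on the image, the number of scans grows with the number of distinct images rather than with the number of jobs.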

erikgb avatar Jul 20 '22 13:07 erikgb

Hi @josedonizetti, we need to adjust the description of this issue a bit so that it no longer asks to skip cronjobs etc. completely. Instead, even if a cronjob runs every minute, the operator should detect whether the image behind it has changed. If the image is still the same, it should be excluded from re-scanning. If the image or image tag has changed in the meantime, it should be part of the scan.

It makes no sense to re-scan an image, for example, 2000 times if it has not changed at all.

The scanner should only scan when:

  • The image or the tag has changed.
  • The TTL kicks in for a regular check of an existing image.
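
The two conditions above can be sketched as a small decision function. This is only an illustration in Go, not how trivy-operator implements it: `shouldScan` and its parameters are hypothetical, and an empty `scannedImage` is assumed to mean that no report exists yet.

```go
package main

import (
	"fmt"
	"time"
)

// shouldScan reports whether a workload needs a new scan.
// scannedImage is the image reference (with tag) covered by the existing
// report, or "" if there is no report; reportAge is how old that report is.
func shouldScan(scannedImage, currentImage string, reportAge, ttl time.Duration) bool {
	if scannedImage == "" {
		return true // never scanned before
	}
	if scannedImage != currentImage {
		return true // image or tag has changed
	}
	return reportAge >= ttl // TTL expired: regular re-check of the same image
}

func main() {
	ttl := 24 * time.Hour
	// A cronjob that runs every minute with an unchanged image is not rescanned.
	fmt.Println(shouldScan("example/app:1.0", "example/app:1.0", time.Minute, ttl)) // false
	// A new tag triggers a scan right away.
	fmt.Println(shouldScan("example/app:1.0", "example/app:1.1", time.Minute, ttl)) // true
	// The same image is rescanned once the TTL has passed.
	fmt.Println(shouldScan("example/app:1.0", "example/app:1.0", 25*time.Hour, ttl)) // true
}
```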

VF-mbrauer avatar Aug 18 '22 19:08 VF-mbrauer

@VF-mbrauer AH! That makes sense. So, for example, if someone is only updating a deployment configuration, without changing an image, we would not rescan it, given image/tag haven't changed. Correct?

If so, I'll change the description/title, because it is currently misleading about what the issue is asking for.

josedonizetti avatar Aug 22 '22 23:08 josedonizetti

@josedonizetti

if someone is only updating a deployment configuration, without changing an image, we would not rescan it, given image/tag haven't changed. Correct?

Yes, that is correct.

But I think that should not differ much from how deployment manifests are already handled today. The concept works that way anyway: as long as nothing changes in the deployment, only the TTL will trigger a re-scan of the image. If somebody changes the Deployment, a scan is triggered as well.

The basis of this idea is the same under the hood: look for a change and then act. The difference here is that we need to check how to integrate it better with jobs that run frequently.

VF-mbrauer avatar Aug 23 '22 06:08 VF-mbrauer

@josedonizetti Do you have any news on this one?

VF-mbrauer avatar Sep 11 '22 16:09 VF-mbrauer

Since jobs/cronjobs represent transient workloads, it should be possible to skip/exclude them from scanning, as is already possible for complete namespaces. Any configuration option would help here to stop frequent scans when a job is scheduled to run at a very short interval.

I faced a similar issue when I tried to scan a particular cluster that had 15.5k non-running jobs. Consequently, the JSON report was 2GB in size, with all the repetition.

nilesh-akhade avatar Sep 16 '22 17:09 nilesh-akhade

Since jobs/cronjobs represent transient workloads, it should be possible to skip/exclude them from scanning, as is already possible for complete namespaces. Any configuration option would help here to stop frequent scans when a job is scheduled to run at a very short interval.

I faced a similar issue when I tried to scan a particular cluster that had 15.5k non-running jobs. Consequently, the JSON report was 2GB in size, with all the repetition.

This was also very similar to my issue. I modified the operator to allow me to choose which workload resources to scan and it has been working well to alleviate things. I opened a PR linked above in case more folks have this same use-case.

tks98 avatar Sep 16 '22 19:09 tks98

I have seen a similar behavior where we have multiple VulnerabilityReports from jobs that are not running anymore. I don't think a report should be created for a Workload that is not currently running/active.

status:
  completionTime: "2022-07-18T22:38:16Z"
  conditions:
  - lastProbeTime: "2022-07-18T22:38:16Z"
    lastTransitionTime: "2022-07-18T22:38:16Z"
    status: "True"
    type: Complete
  startTime: "2022-07-18T22:37:56Z"
  succeeded: 1

fhielpos avatar Dec 01 '22 13:12 fhielpos

@fhielpos In the previous trivy-operator release, v0.7.0, we introduced a new capability to skip workload/resource scanning by label. Let me know if it solves your use case.

chen-keinan avatar Dec 01 '22 13:12 chen-keinan

Thanks for pointing it out. For now, this would fix it since jobs wouldn't be scanned at all. However, I think that when jobs are enabled in the targetWorkload setting, they should only be scanned when active. Then, once the TTL is reached, that report should be deleted if the job is completed.
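
A rough sketch of the behavior I have in mind, in Go. This is not existing trivy-operator code: `jobState`, `decide`, and the action names are hypothetical and only illustrate the rule "scan while active, delete the report after the TTL once the job is finished".

```go
package main

import (
	"fmt"
	"time"
)

// jobState is a hypothetical, reduced view of a Job's status.
type jobState struct {
	Active int // number of running pods, as in the Job's status.active
}

// decide returns what should happen to the report of a job:
// "scan", "keep", "delete", or "skip" (no report and nothing to do).
func decide(job jobState, hasReport bool, reportAge, ttl time.Duration) string {
	expired := hasReport && reportAge >= ttl
	if job.Active > 0 {
		// Active jobs are scanned if they have no report or the TTL has passed.
		if !hasReport || expired {
			return "scan"
		}
		return "keep"
	}
	// Finished jobs are never scanned; their report is removed after the TTL.
	if expired {
		return "delete"
	}
	if hasReport {
		return "keep"
	}
	return "skip"
}

func main() {
	ttl := 24 * time.Hour
	fmt.Println(decide(jobState{Active: 1}, false, 0, ttl))           // scan
	fmt.Println(decide(jobState{Active: 0}, true, 25*time.Hour, ttl)) // delete
	fmt.Println(decide(jobState{Active: 0}, false, 0, ttl))           // skip
}
```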

fhielpos avatar Dec 01 '22 23:12 fhielpos