refactor(k8s): scan config files as a folder
Description
Trivy kubernetes scan tries to process the k8s configs in parallel. but this leads to large overhead, due to initialize the scanner for each file separately (it needs to read rego policies).
This PR stores all k8s config in the same folder and scan this one at once. it gives a boost about 9 times on my test minikube instance.
Before
$ time trivy k8s --report all --skip-images -f json -o result.json
2024-10-10T10:12:01+06:00 INFO Node scanning is enabled
2024-10-10T10:12:01+06:00 INFO If you want to disable Node scanning via an in-cluster Job, please try '--disable-node-collector' to disable the Node-Collector job.
235 / 235 [-----------------------------------------------------------------------------------------------------------------] 100.00% 3 p/s
trivy k8s --report all --skip-images -f json -o result.json 147,50s user 8,63s system 148% cpu 1:45,12 total
After
$ time ./tr k8s --report all --skip-images -f json -o result.json
2024-10-10T17:04:04+06:00 INFO Node scanning is enabled
2024-10-10T17:04:04+06:00 INFO If you want to disable Node scanning via an in-cluster Job, please try '--disable-node-collector' to disable the Node-Collector job.
./tr k8s --report all --skip-images -f json -o result.json 16,83s user 1,12s system 85% cpu 21,002 total
Related issues
- Close #7662
- Close #7684
Checklist
- [x] I've read the guidelines for contributing to this repository.
- [x] I've followed the conventions in the PR title.
- [ ] I've added tests that prove my fix is effective or that my feature works.
- [ ] I've updated the documentation with the relevant information (if needed).
- [ ] I've added usage information (if the PR introduces new options)
- [x] I've included a "before" and "after" example to the description (if the PR is a user interface change).
@itaysk @knqyf263 Could you take a look at this PR when you have time?
I'm unsure about logs. maybe it's a bit confusing for people.
also I can't reproduce #7684 with this update.
wdyt?
thanks a lot for your time.
I'm unsure about logs. maybe it's a bit confusing for people.
cluster scanning can have many resources, and logging each individual resource scan is too much IMO. we can make it a debug level log.
also I can't reproduce https://github.com/aquasecurity/trivy/issues/7684 with this update.
Good, that means you think it solved #7684 too?
I'm unsure about logs. maybe it's a bit confusing for people. cluster scanning can have many resources, and logging each individual resource scan is too much IMO. we can make it a debug level log.
Thanks! I understood your point and agree with them. so I remove the log update from this PR, and will create a new one for logs only.
also I can't reproduce #7684 with this update. Good, that means you think it solved #7684 too?
I'm unsure, because I don't understand a reason of #7684. I'm investigating it now
@itaysk @knqyf263 could you take a look at this PR again? thanks!
@itaysk I finally clarify the reason of #7684 - the race condition appears between a few misconfig scanners, so this PR should resolve it too.
@afdesk Left a couple small comments
lgtm, but I'll wait for @knqyf263 to take a look as well.
@simar7 thanks for your review!!
LGTM
@knqyf263 thanks for your review.
when tests are completed, I'll add this PR to merge queue, ok?
when tests are completed, I'll add this PR to merge queue, ok?
Sure. Just FYI: There are many places where it still says "folder". https://github.com/aquasecurity/trivy/pull/7690/commits/84cacc5dc531c09ccce443271b6f0f37bcf05b58