No blocking on the first pull of an image
Hello, I have a problem with the Harbor cache repo. When a user pulls an image from the cache repo, the download starts instantly. However, I would like the image to be scanned first and only then released to the user. This is only a problem on the first pull of an image; the second pull fails if the vulnerability severity is too high. I would like the first pull to be blocked as well.
Steps to reproduce the problem: Pull an image with a high vulnerability score through the cache repo and check whether the first pull is blocked.
Versions: Please specify the versions of following systems.
- harbor version: v2.13.1
- docker engine version: 27.3.1
- docker-compose version: v2.29.7
In that case you should not use "proxy cache", you should use replication instead.
You can set up a replication policy to regularly pull the image from the remote registry and scan it. "Proxy Cache" is designed to deliver a "proxy" experience: it serves the content as soon as a user requests to pull it.
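A minimal sketch of the replication approach suggested above, in Python. It only builds the JSON bodies and sends nothing. The field names follow my reading of the Harbor v2.0 REST API (`POST /api/v2.0/replication/policies` and the project `metadata` object), so please verify them against your instance's API documentation. The registry ID, project name, filter and cron expression are hypothetical placeholders.

```python
import json


def build_replication_policy(registry_id, dest_project, name_filter, cron):
    """Build a pull-based replication policy body.

    Assumption: field names as in the Harbor v2.0 replication policy model.
    """
    return {
        "name": f"pull-{dest_project}",
        "src_registry": {"id": registry_id},  # remote endpoint, e.g. Docker Hub
        "dest_namespace": dest_project,
        "filters": [{"type": "name", "value": name_filter}],
        "trigger": {"type": "scheduled", "trigger_settings": {"cron": cron}},
        "enabled": True,
        "override": True,
    }


def build_project_scan_settings(severity="high"):
    """Project metadata that scans on push and blocks vulnerable pulls.

    Assumption: Harbor stores these metadata values as strings.
    """
    return {
        "metadata": {
            "auto_scan": "true",
            "prevent_vul": "true",
            "severity": severity,
        }
    }


if __name__ == "__main__":
    # Hypothetical values: registry 1, project "dockerhub-mirror", hourly sync.
    policy = build_replication_policy(1, "dockerhub-mirror", "library/**", "0 0 * * * *")
    print(json.dumps(policy, indent=2))
    print(json.dumps(build_project_scan_settings(), indent=2))
```

With settings like these, images would be scanned when replication pushes them into the project, so the pull policy can already apply to the first client pull. The trade-off is that the image list has to be known in advance.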
But how should I implement this? We want to offer Harbor as a cache for Docker Hub within a company, and I don't know which images the users need, so I can't add them to the registry beforehand. Is there no automation for this?
I think the problem is how to get the vuln data of the image BEFORE it's pulled, and this is out of Harbor's scope. Thinking out loud: to solve your problem, you could write some code to query the info from Docker Hub and implement an admission webhook to block the creation of the pod.
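To make the admission webhook idea a bit more concrete, here is a rough Python sketch of the decision step only. It assumes the Kubernetes `admission.k8s.io/v1` AdmissionReview request/response shape. `lookup_severity` is a hypothetical callback that you would have to implement yourself (for example, against whatever vulnerability source you trust), and the severity names and default threshold are assumptions as well.

```python
# Assumed severity labels, ordered from least to most severe.
SEVERITY_ORDER = ["None", "Negligible", "Low", "Medium", "High", "Critical"]


def is_allowed(severity, max_allowed="Medium"):
    """Return True if severity is known and not above max_allowed.

    Unknown or missing severities are rejected (fail closed).
    """
    if severity not in SEVERITY_ORDER:
        return False
    return SEVERITY_ORDER.index(severity) <= SEVERITY_ORDER.index(max_allowed)


def review_pod(admission_review, lookup_severity, max_allowed="Medium"):
    """Build an AdmissionReview response for a Pod creation request."""
    request = admission_review["request"]
    spec = request["object"]["spec"]
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])

    # Collect every image whose severity is above the threshold or unknown.
    blocked = [
        c["image"]
        for c in containers
        if not is_allowed(lookup_severity(c["image"]), max_allowed)
    ]

    response = {"uid": request["uid"], "allowed": not blocked}
    if blocked:
        response["status"] = {"message": "blocked images: " + ", ".join(blocked)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The function is deliberately independent of any HTTP framework; a real webhook would still need a TLS endpoint and a `ValidatingWebhookConfiguration` pointing at it.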
As an enhancement, Harbor may implement similar logic to query the vuln data from upstream before it's proxied, but I don't think it's high priority, b/c there's no standard for that AFAIK.
I think my intention is the same as the ticket creator's. Harbor should pull the image regardless of its vuln state, then do the scan (synchronous!) and block the download to the requesting client when policies are matched - or pass it through, when no policies apply.
That would be a really neat feature and would bring a huge security benefit.
All your workflows would then be secured simply by putting Harbor in front of CI/CD or even the dev environment... Nothing with any CVE higher than "medium" would be served, and every client would be protected from potential issues.
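The requested "scan first, then serve" flow could be sketched roughly as below. This is not something Harbor's proxy cache does according to this thread; it only illustrates the gate logic in Python. `get_overview` is a hypothetical callback returning a dict with `scan_status` and `severity` keys (or `None` while no scan data exists yet), and the status and severity values are assumptions.

```python
import time

# Assumed terminal values of the scan status field.
FINISHED = {"Success", "Error", "Stopped"}


def wait_for_scan(get_overview, timeout_s=300.0, interval_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_overview() until the scan finishes or the timeout expires.

    Returns the final overview dict, or None on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        overview = get_overview()
        if overview and overview.get("scan_status") in FINISHED:
            return overview
        if clock() >= deadline:
            return None
        sleep(interval_s)


def may_serve(overview, blocked=("High", "Critical")):
    """Serve only if the scan succeeded and the severity is not blocked.

    A missing, failed or timed-out scan is treated as "do not serve".
    """
    return (
        bool(overview)
        and overview.get("scan_status") == "Success"
        and overview.get("severity") not in blocked
    )
```

The cost is visible in `wait_for_scan`: the first client has to wait for the full download plus the scan before receiving anything.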
> Harbor should pull the image regardless of its vuln state, then do the scan (synchronous!) and block the download to the requesting client when policies are matched - or pass it through, when no policies apply.
Before the artifact is pulled to Harbor and scanned, Harbor can't serve the content. IMO this breaks the "proxy" experience, and you can already use a replication policy to do this.