
Harbor 2.3: Replicating existing images whose CVEs are blocked by the destination project fails early on the manifest check

Open sizowie opened this issue 3 years ago • 11 comments

Expected behavior and actual behavior: A pull replication of an image (or several images) containing critical CVEs (it could be any other severity, depending on the destination project's settings) fails when the image is already present in a destination project that prevents pulling images with critical CVEs. It looks like Harbor internally tries to find out whether the image already exists in the destination project before the replication is triggered, but that check is rejected by the project's own security settings. I did not see this behavior in our current setup with 2.2.1, so I assume this is not working as designed. Also: the source registry/project allows fetching images with critical CVEs.
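
For illustration, the check that fails can be reproduced outside of the replication job with a plain HEAD request against the destination registry's manifest endpoint; with the vulnerability-prevention policy enabled and a critical CVE already recorded, Harbor appears to answer it with 412 Precondition Failed instead of 200 or 404. The Go sketch below is only an illustration of that request, not Harbor's replication code; the host name, bearer token, and image reference are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// HEAD <registry>/v2/<project>/<repository>/manifests/<tag> is the standard
	// registry-v2 manifest existence check performed during replication.
	// Host, image reference, and token are placeholders.
	url := "https://harbor.example.com/v2/project-a/postgres/manifests/10"

	req, err := http.NewRequest(http.MethodHead, url, nil)
	if err != nil {
		panic(err)
	}
	// Registry bearer token obtained via the usual token-service flow (omitted here).
	req.Header.Set("Authorization", "Bearer <token>")
	req.Header.Set("Accept",
		"application/vnd.docker.distribution.manifest.v2+json, "+
			"application/vnd.docker.distribution.manifest.list.v2+json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// With "Prevent vulnerable images" enabled and a critical CVE recorded for
	// this tag, the status observed here is 412 rather than 200 or 404.
	fmt.Println("status:", resp.StatusCode)
}
```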

Steps to reproduce the problem:

  • Create a project (project-a)
  • Set "Automatically scan images on push", "Prevent vulnerable images from running." and "Prevent images with vulnerability severity of Critical and above from being deployed." in project-a's configuration
  • Set up a pull replication of an image with a critical CVE to project-a
  • Trigger replication
  • The first attempt succeeds (the image is pushed and scanned locally, and the scan results contain a critical CVE)
  • The next replication attempt fails early while checking whether the image is already present locally
  • Unset the security settings listed above -> replication works again (see the sketch after this list for inspecting these settings via the API)
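
For reference, the project settings involved in the steps above can also be inspected via the API. A minimal sketch, assuming the Harbor v2.0 project metadata endpoint (GET /api/v2.0/projects/{project}/metadatas/) and the metadata keys auto_scan, prevent_vul, and severity; verify the exact endpoint and key names against your version's API spec. Host and credentials are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Read the metadata map of the destination project (placeholder host/project).
	url := "https://harbor.example.com/api/v2.0/projects/project-a/metadatas/"

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		panic(err)
	}
	req.SetBasicAuth("admin", "<password>") // placeholder credentials

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// With the policies from the steps above enabled, the map is expected to
	// include entries such as:
	//   "auto_scan": "true", "prevent_vul": "true", "severity": "critical"
	var meta map[string]string
	if err := json.NewDecoder(resp.Body).Decode(&meta); err != nil {
		panic(err)
	}
	fmt.Printf("project-a metadata: %v\n", meta)
}
```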

Versions:

  • harbor version: 2.3.0
  • docker engine version: Docker version 20.10.7, build f0df350
  • docker-compose version: docker-compose version 1.28.2, build 67630359

Additional context:

==> /data/harbor-log/jobservice.log <==
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/jobservice/worker/cworker/c_worker.go:76]: Job incoming: {"name":"REPLICATION","id":"d6f9020ad6bf4ebc7b8c7bbe","t":1625684358,"args":null}
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/pkg/config/rest/rest.go:53]: get configuration from url: http://core:8080/api/v2.0/internalconfig
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/pkg/config/rest/rest.go:53]: get configuration from url: http://core:8080/api/v2.0/internalconfig

==> /data/harbor-log/core.log <==
Jul  7 20:59:18 192.168.13.1 core[5614]: 2021-07-07T18:59:18Z [INFO] [/pkg/notifier/notifier.go:205]: Handle notification with Handler 'ReplicationWebhook' on topic 'REPLICATION': ReplicationTaskID-158 Status-Running OccurAt-2021-07-07 18:59:18

==> /data/harbor-log/jobservice.log <==
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/controller/replication/transfer/image/transfer.go:124]: client for source registry [type: harbor, URL: https://registry.dev, insecure: false] created
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/controller/replication/transfer/image/transfer.go:134]: client for destination registry [type: harbor, URL: http://core:8080, insecure: true] created
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/controller/replication/transfer/image/transfer.go:167]: copying library/postgres:[10](source registry) to project-a/postgres:[10](destination registry)...
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/controller/replication/transfer/image/transfer.go:191]: copying library/postgres:10(source registry) to project-a/postgres:10(destination registry)...
Jul  7 20:59:18 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:18Z [INFO] [/controller/replication/transfer/image/transfer.go:333]: pulling the manifest of artifact library/postgres:10 ...
Jul  7 20:59:19 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:19Z [INFO] [/controller/replication/transfer/image/transfer.go:339]: the manifest of artifact library/postgres:10 pulled
Jul  7 20:59:19 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:19Z [ERROR] [/controller/replication/transfer/image/transfer.go:347]: failed to check the existence of the manifest of artifact project-a/postgres:10 on the destination registry: http status code: 412, body:
Jul  7 20:59:19 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:19Z [ERROR] [/controller/replication/transfer/image/transfer.go:175]: http status code: 412, body:
Jul  7 20:59:19 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:19Z [ERROR] [/controller/replication/transfer/image/transfer.go:181]: got error during the whole transfer period, mark the job failure
Jul  7 20:59:19 192.168.13.1 jobservice[5614]: 2021-07-07T18:59:19Z [ERROR] [/jobservice/runner/redis.go:113]: Job 'REPLICATION:d6f9020ad6bf4ebc7b8c7bbe' exit with error: run error: got error during the whole transfer period, mark the job failure

==> /data/harbor-log/core.log <==
Jul  7 20:59:19 192.168.13.1 core[5614]: 2021-07-07T18:59:19Z [INFO] [/pkg/notifier/notifier.go:205]: Handle notification with Handler 'ReplicationWebhook' on topic 'REPLICATION': ReplicationTaskID-158 Status-Error OccurAt-2021-07-07 18:59:19

sizowie · Jul 07 '21

It looks like this behavior was introduced with commit https://github.com/goharbor/harbor/commit/ea35e7b9eccec5d34255edc04c9054d771d4fb90

sizowie · Jul 08 '21

@bitsf @ywk253100 can you confirm this is a bug?

sizowie · Jul 14 '21

Could you please investigate? This issue is preventing us from upgrading from 2.2 to 2.3.

sizowie · Aug 30 '21

Issue exists in v2.3.2 also.

sizowie · Aug 31 '21

@ninjadq @wy65701436 @bitsf any update on this? We recently upgraded to v2.3.2 in our test environment and hit this issue (#15560).

dkulchinsky · Sep 14 '21

The problem is that we added a HEAD request during replication, and it now hits the 412 error. We will check whether we can work around this, for example by pushing forcibly.

bitsf · Sep 16 '21

We cannot just ignore the 412 error and force push because this may cause a replication deadlock if two Harbor instances replicate to each other.

We may leverage the artifact API to check the existence of the manifest before pushing, but the manifest-checking logic is also used by the proxy cache, which needs the size of the manifest, and the artifact API doesn't return that.

So we need to find a better way to resolve this issue; @wy65701436 suggests that we can add a new blob API to return the size of the manifest.

We will do more investigation and move the issue into 2.3.4 and 2.4.1.
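
For illustration, the artifact-API style of existence check mentioned above could look roughly like the sketch below. This is a rough idea sketch, not Harbor's implementation; host, project, repository, tag, and credentials are placeholders, and, as noted above, this API does not return the manifest size that the proxy-cache path needs.

```go
package main

import (
	"fmt"
	"net/http"
)

// artifactExists asks Harbor's artifact API (instead of the registry-v2
// manifest endpoint) whether the artifact is already present in the
// destination project, so the check is not answered by the
// vulnerability-prevention middleware. All parameters are placeholders.
func artifactExists(base, project, repo, ref, user, pass string) (bool, error) {
	// Note: repository names containing "/" must be URL-encoded for this API.
	url := fmt.Sprintf("%s/api/v2.0/projects/%s/repositories/%s/artifacts/%s",
		base, project, repo, ref)

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req.SetBasicAuth(user, pass)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

func main() {
	exists, err := artifactExists("https://harbor.example.com",
		"project-a", "postgres", "10", "admin", "<password>")
	if err != nil {
		panic(err)
	}
	fmt.Println("artifact already exists:", exists)
}
```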

ywk253100 · Sep 24 '21

Any updates on this issue?

sizowie · Oct 29 '21

Is there a workaround we can use in the meantime? Otherwise, replication with vulnerability prevention enabled is completely broken/unusable.

sizowie · Nov 16 '21

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Jul 05 '22

Issue still exists - do not close.

sizowie · Jul 05 '22

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Nov 01 '22

This issue is still relevant, please do not close.

sizowie · Nov 01 '22

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Jan 01 '23

This issue is still relevant, please do not close.

sizowie · Jan 01 '23

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Mar 03 '23

Issue still exists (also in newer releases, 2.6) - do not close.

sizowie · Mar 03 '23

Issue still exists in 2.8.0 - please don't close this issue (and add appropriate tags if necessary)

bitbull06 · May 09 '23

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Jul 28 '23

Still an issue. Do not close.

sizowie · Jul 28 '23

Any updates or workarounds for this? Have the same issue on v2.8.1

NikolaDimic · Sep 21 '23

> Any updates or workarounds for this? Have the same issue on v2.8.1

We reverted the commit that introduced the issue (https://github.com/goharbor/harbor/commit/ea35e7b9eccec5d34255edc04c9054d771d4fb90) and built Harbor from source.

Unfortunately, this is AFAIK the only way to "fix" it. We have tested it from 2.3 up to 2.8.4 without problems.

sizowie · Sep 21 '23

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Nov 21 '23

Still an issue with latest release (2.9). Please do not close.

sizowie · Nov 21 '23

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Jan 22 '24

Do not close.

sizowie · Jan 22 '24

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] · Mar 23 '24

Still an issue. Please do not close.

sizowie · Mar 23 '24