
Inconsistent Behavior When Triggering Trivy Scan via Harbor API

Open kon-foo opened this issue 1 year ago • 3 comments

Expected behavior and actual behavior:

Expected behavior: When sending a request to trigger a Trivy scan of an artifact through the Harbor API, I expect the scan to consistently either succeed or fail, since the configuration of the artifact and of the Trivy scanner does not change between requests.

Actual behavior: Some attempts succeed, while others return a 400 Bad Request error with the following message:

The configured scanner Trivy does not support scanning artifact with mime type application/vnd.docker.distribution.manifest.v2+json

While the inconsistency itself might be an issue with our configuration and environment, the error message at least can't be correct. Screenshots showing both a successful and a failed request are attached to this issue.

Steps to reproduce the problem:

  1. Set up a Harbor registry (v2.10.2) and Trivy (goharbor/trivy-adapter-photon:v2.10.2) in a Kubernetes cluster.
  2. Attempt to trigger a scan of an artifact via the Harbor API by sending a POST request to the /scan endpoint (see the sketch after this list).
  3. Observe the inconsistent behavior.
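
For reference, the requests were of roughly this shape, hitting Harbor's v2 artifact scan endpoint. A minimal sketch in Go; host, project, repository, digest, and credentials are placeholders for our actual values:

	package main

	import (
		"fmt"
		"net/http"
	)

	func main() {
		// POST /api/v2.0/projects/{project}/repositories/{repo}/artifacts/{reference}/scan
		// Host, project, repository, digest, and credentials below are placeholders.
		url := "https://harbor.example.com/api/v2.0/projects/myproject/repositories/myrepo/artifacts/sha256:<digest>/scan"

		req, err := http.NewRequest(http.MethodPost, url, nil)
		if err != nil {
			panic(err)
		}
		req.SetBasicAuth("admin", "<password>")

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()

		// Sometimes "202 Accepted", sometimes "400 Bad Request" with the
		// misleading mime-type message quoted above.
		fmt.Println(resp.Status)
	}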

Versions:

  • Harbor version: v2.10.2-1a741cb7 (from image goharbor/registry-photon:v2.10.2)
  • Trivy image: goharbor/trivy-adapter-photon:v2.10.2
  • Kubernetes: v1.30.3

Additional context:

400 Response

harborapi_scan_400_blurred

202 Response

harborapi_scan_202_blured

kon-foo avatar Sep 23 '24 09:09 kon-foo

@kon-foo Could you please reproduce the issue (both success and failure in scan) and collect the logs of nginx, harbor-core, harbor-jobservice and trivy-adapter pods?

Could you please also let me know how Harbor was deployed in your env? What makes you feel it may be an issue with your configuration and environment?

reasonerjt avatar Sep 30 '24 05:09 reasonerjt

@reasonerjt Thanks for looking into this. Harbor was deployed using this Helm chart. These are the images in use:

| Component | Image |
| --- | --- |
| harbor-core | goharbor/harbor-core:v2.10.2 |
| harbor-database | goharbor/harbor-db:v2.10.2 |
| harbor-jobservice | goharbor/harbor-jobservice:v2.10.2 |
| harbor-portal | goharbor/harbor-portal:v2.10.2 |
| harbor-redis | goharbor/redis-photon:v2.10.2 |
| harbor-registry | goharbor/registry-photon:v2.10.2 & goharbor/harbor-registryctl:v2.10.2 |
| harbor-trivy | goharbor/trivy-adapter-photon:v2.10.2 |

Here are the logs:

This time I actually had to hit the API ~40 times before getting a 202. Core failed to ping the scanner 39 times:

2024-10-01T05:34:15Z [ERROR] [/controller/scanner/base_controller.go:299][error="v1 client: get metadata: Get "http://release-registry-harbor-trivy:8080/api/v1/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" requestID="27fa31aade1ae4e2a9e422a198fd0544"]: failed to ping scanner
2024-10-01T05:34:15Z [ERROR] [/controller/scanner/base_controller.go:265]: api controller: get project scanner: scanner controller: ping: v1 client: get metadata: Get "http://release-registry-harbor-trivy:8080/api/v1/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Before finally succeeding:

2024-10-01T05:35:45Z [INFO] [/server/middleware/security/robot.go:71][requestID="9553b0d0-4924-4218-93fc-e4d8891358f3"]: a robot security context generated for request GET /service/token
2024-10-01T05:35:53Z [INFO] [/pkg/task/dao/execution.go:471]: scanned out 1 executions with outdate status, refresh status to db
2024-10-01T05:35:53Z [INFO] [/pkg/task/dao/execution.go:512]: refresh outdate execution status done, 1 succeed, 0 failed

> What makes you feel it may be an issue with your configuration and environment?

I just added this to emphasize that even if it did have something to do with our configuration or environment, I would still consider this unwanted behavior, because the mime type is not the problem and the error message is misleading. While it wasn't me who deployed Harbor in our cluster, I am not aware of any unusual configuration; the failing scanner pings, though, make me suspect a networking or permissions misconfiguration.

Thanks for your help and let me know if you need further information.

kon-foo avatar Oct 01 '24 06:10 kon-foo

I had some time to dig deeper and was able to locate the issue. First of all, the "inconsistency" stems from our Trivy container sometimes not answering within the hardcoded 5s timeout of the REST clients, so that part is not on Harbor.
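
That timeout failure is easy to reproduce: "Client.Timeout exceeded while awaiting headers" is what Go's net/http client reports whenever its Timeout elapses before the server sends response headers. A minimal, self-contained sketch (the slow handler stands in for our occasionally slow Trivy adapter; a shorter timeout than the real 5s is used so the demo finishes quickly):

	package main

	import (
		"fmt"
		"net/http"
		"net/http/httptest"
		"time"
	)

	func main() {
		// Stand-in for a Trivy adapter that is slow to answer /api/v1/metadata.
		slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			time.Sleep(2 * time.Second) // longer than the client timeout below
			fmt.Fprint(w, "{}")
		}))
		defer slow.Close()

		// Harbor's client uses a hardcoded 5s timeout; 500ms keeps the demo quick.
		client := &http.Client{Timeout: 500 * time.Millisecond}

		_, err := client.Get(slow.URL + "/api/v1/metadata")
		fmt.Println(err)
		// Get "http://127.0.0.1:<port>/api/v1/metadata": context deadline
		// exceeded (Client.Timeout exceeded while awaiting headers)
	}

The actual bug is how these timeouts are handled in GetRegistrationByProject and Scan: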

	if opts.Ping {
		// Get metadata of the configured registration
		meta, err := bc.Ping(ctx, registration)
		if err != nil {
			// Not blocked, just logged it
			log.Error(errors.Wrap(err, "api controller: get project scanner"))
			registration.Health = statusUnhealthy
		} else {
			...
		}
	}
	return registration, nil

If bc.Ping returns an error, it just gets logged, the registration is marked unhealthy, and the registration is returned with an empty Metadata. The Scan method, however, only checks for the existence of a registration, not for its health, and therefore proceeds to compare the artifact's mime type against an empty Metadata object, producing the confusing error quoted at the top.
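
Presumably the mime-type check walks the capabilities the scanner advertised in its Metadata, so with an empty Metadata nothing can ever match. A hypothetical, pared-down reconstruction of that logic (Capability and ConsumesMimeTypes are modeled on the pluggable-scanner metadata, not copied from Harbor's code):

	// Pared-down stand-in for a scanner capability as advertised via
	// /api/v1/metadata; hypothetical, for illustration only.
	type Capability struct {
		ConsumesMimeTypes []string
	}

	func supportsMime(caps []Capability, mime string) bool {
		for _, c := range caps {
			for _, m := range c.ConsumesMimeTypes {
				if m == mime {
					return true
				}
			}
		}
		// With an empty Metadata there are no capabilities at all, so this
		// is false for every mime type and the caller reports the misleading
		// "does not support scanning artifact with mime type" error, even
		// though the real failure was the ping.
		return false
	}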

Fix

Assuming there is a reason for only setting registration.Health = statusUnhealthy instead of returning the error, the easy fix would be to check the registration's health in Scan. I could create a PR for that if you want me to, @reasonerjt?
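
For illustration, the guard could look roughly like this near the top of Scan (a fragment in the style of the excerpt above; the statusHealthy constant and the error wording are assumptions for the sketch, not Harbor's actual identifiers):

	// Illustrative guard; registration is the value returned by
	// GetRegistrationByProject with ping enabled.
	if registration.Health != statusHealthy {
		// Fail fast with an accurate message instead of falling through
		// to the mime-type comparison against an empty Metadata object.
		return fmt.Errorf("scanner %q is unhealthy (last ping failed), refusing to scan", registration.Name)
	}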

However, it might be worth reconsidering whether swallowing this error is a good idea, and checking whether all callers of GetRegistrationByProject handle registration.Health correctly. I am not familiar enough with Harbor or Go to do that myself.

kon-foo avatar Oct 16 '24 12:10 kon-foo

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Dec 16 '24 09:12 github-actions[bot]

This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please re-open a new issue.

github-actions[bot] avatar Jan 16 '25 09:01 github-actions[bot]