Harbor garbage collection goes immediately into Pending status and stays there

Open mdavid01 opened this issue 7 months ago • 2 comments

This issue has been reported before without resolution. Perhaps this explanation better describes the situation. The same issue has also been posted to the distribution project on GitHub as #4644.

We are running Harbor registry 2.12 using the Harbor Helm config in two AWS TKG (actually Palette) environments of about the same size (Env 1 = 100 TB, Env 2 = 136 TB and growing). We use an external AWS RDS PostgreSQL database with S3 storage for manifests, layers, and blobs. Env 1 runs Harbor garbage collection daily and successfully. When executed, the GC TASK table row goes immediately to 'Running' status as expected.

Env 2 is growing because garbage collection, when executed, goes immediately into 'Pending' status (per the PostgreSQL TASK table). The Env 2 ARTIFACT_TRASH table now contains 107,000 entries vs. Env 1's 5,000+. No ARTIFACT_TRASH records are being deleted. Env 2 garbage collection ran fine for about a year until 2025-03-07, when its TASK suddenly began going into Pending status right from the start of each GC execution. We are not aware of any events on 03-06 or 03-07 that might have impacted GC.

We figure Env 2 hit some condition (perhaps from distribution) and returned a status code that Harbor interprets as 'Pending'. Conditions might include data sync situations caused by a prior network, Palette, Harbor, security or infrastructure failure. The Harbor application in Env 2 continues to operate as expected for our 6,000 users. GC is the only concern.

Can you offer one or more conditions in the Distribution or Harbor app that would cause Harbor to set the GC task status to Pending? What other info can we provide?

FROM POSTGRESQL EXECUTION TABLE

id: 5272810
vendor_type: GARBAGE_COLLECTION
vendor_id: -1
status: Running
status_message: (empty)
trigger: MANUAL
extra_attrs: {"delete_untagged":false,"dry_run":true,"redis_url_reg":"redis+sentinel://sentinel-harbor-gov-prod-redis:26379/mymaster/2?idle_timeout_seconds=30","time_window":2,"workers":5}
start_time: 2025-05-30 12:46:22.770
end_time: (empty)
revision: 2
update_time: 2025-05-30 13:17:46.000

FROM POSTGRESQL TASK TABLE

id: 5868113
execution_id: 5272810
job_id: 18bcde8fcb4a594400b80336
status: Pending
status_code: 0
status_revision: 0
status_message: (empty)
run_count: 0
extra_attrs: {}
creation_time: 2025-05-30 12:46:22.777
start_time: (empty)
update_time: 2025-05-30 12:46:22.777
end_time: (empty)
vendor_type: GARBAGE_COLLECTION
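
The `extra_attrs` value on the EXECUTION row is JSON and is easier to read once decoded. A minimal Python sketch, assuming the string is copied as-is from the row above (the variable names are illustrative only):

```python
import json

# extra_attrs value copied verbatim from the EXECUTION row above.
extra_attrs = (
    '{"delete_untagged":false,"dry_run":true,'
    '"redis_url_reg":"redis+sentinel://sentinel-harbor-gov-prod-redis:26379/mymaster/2?idle_timeout_seconds=30",'
    '"time_window":2,"workers":5}'
)

params = json.loads(extra_attrs)

# Print each GC parameter on its own line, sorted by key.
for key in sorted(params):
    print(f"{key}: {params[key]}")
```

Decoded this way, the row shows that this particular execution was a manual dry run with 5 workers and `delete_untagged` disabled.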

mdavid01 • May 30 '25 14:05

Can you please give us a screenshot of the GC execution history? There should be a failed GC job before the current GC job.

The new task stays Pending because a previous GC job is still running in the background and has not released the lock.
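
One way to look for such a leftover job is to scan the EXECUTION rows for GC executions that are still marked Running long after they started. The following is only a sketch under stated assumptions: the rows are plain dictionaries using the column names shown in the issue, the 24-hour threshold is arbitrary, and the second row is invented purely to exercise the filter.

```python
from datetime import datetime, timedelta


def find_stale_gc_executions(rows, now, max_age=timedelta(hours=24)):
    """Return GC executions still marked Running that started more than max_age before now."""
    stale = []
    for row in rows:
        if row["vendor_type"] != "GARBAGE_COLLECTION":
            continue
        if row["status"] != "Running":
            continue
        if now - row["start_time"] > max_age:
            stale.append(row)
    return stale


# Hypothetical input: the first row mirrors the execution posted in the issue,
# the second is an invented older run that never finished.
rows = [
    {"id": 5272810, "vendor_type": "GARBAGE_COLLECTION", "status": "Running",
     "start_time": datetime(2025, 5, 30, 12, 46, 22)},
    {"id": 1, "vendor_type": "GARBAGE_COLLECTION", "status": "Running",
     "start_time": datetime(2025, 3, 7, 0, 0, 0)},
]

now = datetime(2025, 5, 30, 14, 5)
stale_ids = [row["id"] for row in find_stale_gc_executions(rows, now)]
print(stale_ids)
```

Any execution flagged this way is only a candidate for the job holding the lock; the GC execution history remains the place to confirm it.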

You could refer to this discussion to fix the problem: https://github.com/goharbor/harbor/discussions/21188

Or you can try this feature in Harbor 2.13.0: https://github.com/goharbor/harbor/pull/21390

It seems that the GC takes a long time to run (more than 24 hours). You can try increasing the garbage collection job's worker number to 10 to speed it up.
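
As an illustration of the worker setting, here is a sketch of a request body for a manual GC run with 10 workers. The parameter names come from the `extra_attrs` shown in the issue; the overall body shape (a `schedule` object plus a `parameters` object, as sent to `/api/v2.0/system/gc/schedule`) is an assumption and should be checked against the API reference for your Harbor version. The helper name is hypothetical.

```python
import json


def build_manual_gc_request(workers=10, delete_untagged=False, dry_run=False):
    """Build a request body for a manually triggered GC run (assumed shape)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return {
        # Assumed schedule shape for a one-off manual run.
        "schedule": {"type": "Manual"},
        # Parameter names as they appear in the execution's extra_attrs.
        "parameters": {
            "delete_untagged": delete_untagged,
            "dry_run": dry_run,
            "workers": workers,
        },
    }


body = build_manual_gc_request(workers=10)
print(json.dumps(body, indent=2))
```

The sketch only builds and prints the body; it does not contact a Harbor instance.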

stonezdj • Jun 03 '25 05:06

Thx DJ. Message rec'd. These suggestions sound promising. It will take a few days to implement and test. We'll update status and results then. This issue can be closed and we'll re-open it after implementation, or it can stay open until then. Whatever you prefer.

mdavid01 • Jun 03 '25 11:06

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] • Aug 03 '25 09:08

This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please re-open a new issue.

github-actions[bot] • Sep 02 '25 09:09