Harbor Garbage Collection goes immediately into Pending status and stays there
This issue has been reported before without resolution; perhaps this explanation describes the situation better. The same issue has also been posted to the distribution project on GitHub as #4644.
We are running Harbor registry 2.12, deployed with the Harbor Helm configuration, in two AWS TKG (actually Palette) environments of roughly the same size (Env 1 = 100 TB, Env 2 = 136 TB and growing). Both use an external AWS RDS PostgreSQL database, with S3 storage for manifests, layers, and blobs. Env 1 runs Harbor garbage collection daily and successfully: when GC is executed, the TASK table row goes immediately to 'Running' status, as expected.
Env 2 is growing because garbage collection, when executed, goes immediately into 'Pending' status (per the PostgreSQL TASK table). The Env 2 ARTIFACT_TRASH table now contains 107,000 entries versus Env 1's 5,000+, and no ARTIFACT_TRASH records are being deleted. Env 2 garbage collection ran fine for about a year until 2025-03-07, when its TASK suddenly began going into Pending status right from the start of the GC execution. We are not aware of any events on 03-06 or 03-07 that might have impacted GC.
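For reference, a minimal SQL sketch of one way to watch that backlog in each environment (it assumes direct access to the Harbor registry database; the creation_time column comes from the standard Harbor schema and may differ by version):

```sql
-- Count the artifacts waiting to be purged by GC.
-- ARTIFACT_TRASH is where Harbor parks deleted artifacts until a
-- successful GC run removes the underlying blobs.
SELECT count(*)           AS trashed_artifacts,
       min(creation_time) AS oldest_entry
FROM artifact_trash;
```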
We suspect Env 2 hit some condition (perhaps in distribution) that returns a status Harbor interprets as 'Pending'. Possible causes include data-sync problems left behind by a prior network, Palette, Harbor, security, or infrastructure failure. The Harbor application in Env 2 continues to operate as expected for our 6,000 users; GC is the only concern.
Can you offer one or more conditions in the Distribution or Harbor app that would cause Harbor to set the GC task status to Pending? What other information can we provide?
FROM POSTGRESQL EXECUTION TABLE
| id | vendor_type | vendor_id | status | status_message | trigger | extra_attrs | start_time | end_time | revision | update_time |
|---|---|---|---|---|---|---|---|---|---|---|
| 5272810 | GARBAGE_COLLECTION | -1 | Running | | MANUAL | {"delete_untagged":false,"dry_run":true,"redis_url_reg":"redis+sentinel://sentinel-harbor-gov-prod-redis:26379/mymaster/2?idle_timeout_seconds=30","time_window":2,"workers":5} | 2025-05-30 12:46:22.770 | | 2 | 2025-05-30 13:17:46.000 |
FROM POSTGRESQL TASK TABLE
| id | execution_id | job_id | status | status_code | status_revision | status_message | run_count | extra_attrs | creation_time | start_time | update_time | end_time | vendor_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5868113 | 5272810 | 18bcde8fcb4a594400b80336 | Pending | 0 | 0 | | 0 | {} | 2025-05-30 12:46:22.777 | | 2025-05-30 12:46:22.777 | | GARBAGE_COLLECTION |
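For anyone trying to reproduce the view above, here is a hedged SQL sketch (table and column names taken from the dumps shown here) that lists recent GC executions together with their tasks:

```sql
-- List recent GC executions and their tasks so the Pending ones stand out.
-- Table and column names match the EXECUTION and TASK dumps above.
SELECT e.id     AS execution_id,
       e.status AS execution_status,
       e.start_time,
       t.id     AS task_id,
       t.job_id,
       t.status AS task_status,
       t.creation_time
FROM execution e
LEFT JOIN task t ON t.execution_id = e.id
WHERE e.vendor_type = 'GARBAGE_COLLECTION'
ORDER BY e.start_time DESC
LIMIT 20;
```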
Could you please share a screenshot of the GC execution history? There should be a failed GC job before the current one.
The current GC job stays Pending because a previous GC job is still running in the background and has not released the lock.
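To confirm that, a quick sketch of a query against the execution table (the terminal status values other than 'Running' are assumptions here) can show whether an older GC execution is still hanging on to the lock:

```sql
-- Find GC executions that never reached a terminal state; a stale
-- 'Running' row here is the previous GC job that still holds the lock.
SELECT id, status, start_time, update_time
FROM execution
WHERE vendor_type = 'GARBAGE_COLLECTION'
  AND status NOT IN ('Success', 'Error', 'Stopped')
ORDER BY start_time;
```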
You can refer to this discussion to fix the problem: https://github.com/goharbor/harbor/discussions/21188
Or you can try this feature in Harbor 2.13.0: https://github.com/goharbor/harbor/pull/21390
It also seems that the GC takes a long time to run (>24 hours); you can try increasing the garbage collection job's worker number to 10 to speed it up.
Thanks DJ, message received. These suggestions sound promising. It will take a few days to implement and test; we'll update the status and results then. This issue can be closed and we'll re-open it after implementation, or it can stay open until then, whichever you prefer.
This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.
This issue was closed because it has been stalled for 30 days with no activity. If this issue is still relevant, please re-open a new issue.