GC issue when using S3 as backend storage
Hi,
we're facing an issue with the GC when using S3 as backend storage. There are already closed issues with similar behavior (like #18896), but none of them was closed with a sufficient reason.
Release: Harbor 2.13.1
Problem: GC only removes unused manifest objects from S3. Blobs and layers remain on S3. The bucket is set up without versioning. See logs below.
Before GC:
# aws s3 ls s3://registry/ --recursive
2025-06-23 07:16:17 5510 docker/registry/v2/blobs/sha256/85/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/data
2025-06-23 07:16:18 625 docker/registry/v2/blobs/sha256/bc/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6/data
2025-06-23 07:16:16 138165992 docker/registry/v2/blobs/sha256/fa/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/data
2025-06-23 07:16:17 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/link
2025-06-23 07:16:16 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/link
2025-06-23 07:16:18 71 docker/registry/v2/repositories/janek/test1/_manifests/revisions/sha256/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6/link
2025-06-23 07:16:18 71 docker/registry/v2/repositories/janek/test1/_manifests/tags/latest/current/link
2025-06-23 07:16:18 71 docker/registry/v2/repositories/janek/test1/_manifests/tags/latest/index/sha256/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6/link
After GC via Harbor UI (with untagged-artifacts checked):
registry:
...
time="2025-06-23T10:01:19.488973985Z" level=info msg="authorized request" go.version=go1.23.8 http.request.host="registry:5000" http.request.id=c5257b96-1257-46bc-b603-b75bf555ad7a http.request.method=HEAD http.request.remoteaddr="x.x.8.30:44044" http.request.uri="/v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6" http.request.useragent=harbor-registry-client vars.name="janek/test1" vars.reference="sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6"
time="2025-06-23T10:01:19.513238586Z" level=info msg="redis: connect redis:6379" go.version=go1.23.8 instance.id=7be07b2e-4e6a-4c28-8dee-95e91ef99831 redis.connect.duration=2.440368ms service=registry version=v2.8.3-23-g0c62ec3e.m
time="2025-06-23T10:01:19.52625557Z" level=info msg="response completed" go.version=go1.23.8 http.request.host="registry:5000" http.request.id=c5257b96-1257-46bc-b603-b75bf555ad7a http.request.method=HEAD http.request.remoteaddr="x.x.8.30:44044" http.request.uri="/v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6" http.request.useragent=harbor-registry-client http.response.contenttype="application/vnd.oci.image.manifest.v1+json" http.response.duration=325.814066ms http.response.status=200 http.response.written=625
x.x.8.30 - - [23/Jun/2025:10:01:19 +0000] "HEAD /v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6 HTTP/1.1" 200 625 "" "harbor-registry-client"
time="2025-06-23T10:01:19.820529047Z" level=info msg="authorized request" go.version=go1.23.8 http.request.host="registry:5000" http.request.id=15763a7b-c3ba-4967-a29d-b9a644e6e8a2 http.request.method=DELETE http.request.remoteaddr="x.x.8.30:44044" http.request.uri="/v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6" http.request.useragent=harbor-registry-client vars.name="janek/test1" vars.reference="sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6"
x.x.8.30 - - [23/Jun/2025:10:01:19 +0000] "DELETE /v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6 HTTP/1.1" 202 0 "" "harbor-registry-client"
time="2025-06-23T10:01:19.920116344Z" level=info msg="response completed" go.version=go1.23.8 http.request.host="registry:5000" http.request.id=15763a7b-c3ba-4967-a29d-b9a644e6e8a2 http.request.method=DELETE http.request.remoteaddr="x.x.8.30:44044" http.request.uri="/v2/janek/test1/manifests/sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6" http.request.useragent=harbor-registry-client http.response.duration=388.118907ms http.response.status=202 http.response.written=0
...
jobservice:
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:162]: Garbage Collection parameters: [delete_untagged: true, dry_run: false, time_window: 2, workers: 1]
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:172]: start to run gc in job.
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:546]: start to delete untagged artifact (no actually deletion for dry-run mode)
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:580]: end to delete untagged artifact (no actually deletion for dry-run mode)
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:595]: artifact trash candidates.
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:597]: ID-38 MediaType-application/vnd.oci.image.config.v1+json ManifestMediaType-application/vnd.oci.image.manifest.v1+json RepositoryName-janek/test1 Digest-sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6 CreationTime-2025-06-23 07:21:54
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:261]: blob eligible for deletion: sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:261]: blob eligible for deletion: sha256:fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:261]: blob eligible for deletion: sha256:85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:273]: 2 blobs and 1 manifests eligible for deletion
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:274]: The GC could free up 131 MB space, the size is a rough estimation.
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:339]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete the manifest with registry v2 API: janek/test1, application/vnd.oci.image.manifest.v1+json, sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:19Z [INFO] [/pkg/config/rest/rest.go:47]: get configuration from url: http://core/api/v2.0/internalconfig
2025-06-23T10:01:19Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:368]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete manifest from storage: sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:19Z [INFO] [/pkg/config/rest/rest.go:47]: get configuration from url: http://core/api/v2.0/internalconfig
2025-06-23T10:01:21Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:396]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete artifact blob record from database: 38, janek/test1, sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:21Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:404]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete artifact trash record from database: 38, janek/test1, sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:21Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:422]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete blob from storage: sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:24Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:451]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][1/3] delete blob record from database: 39, sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
2025-06-23T10:01:24Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:422]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][2/3] delete blob from storage: sha256:fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486
2025-06-23T10:01:26Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:451]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][2/3] delete blob record from database: 37, sha256:fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486
2025-06-23T10:01:26Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:422]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][3/3] delete blob from storage: sha256:85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c
2025-06-23T10:01:26Z [INFO] [/pkg/config/rest/rest.go:47]: get configuration from url: http://core/api/v2.0/internalconfig
2025-06-23T10:01:28Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:451]: [f1fe6707-331f-4c70-8a07-6afdf1628b83][3/3] delete blob record from database: 38, sha256:85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c
2025-06-23T10:01:28Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:480]: 2 blobs and 1 manifests are actually deleted
2025-06-23T10:01:28Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:481]: The GC job actual frees up 131 MB space.
2025-06-23T10:01:28Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:518]: cache clean up completed
2025-06-23T10:01:28Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:200]: success to run gc in job.
2025-06-23T10:01:28Z [INFO] [/jobservice/runner/redis.go:152]: Job 'GARBAGE_COLLECTION:fd9b8e866ac2cfe0ef019dd6' exit with success
# aws s3 ls s3://registry/ --recursive
2025-06-23 07:16:17 5510 docker/registry/v2/blobs/sha256/85/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/data
2025-06-23 07:16:18 625 docker/registry/v2/blobs/sha256/bc/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6/data
2025-06-23 07:16:16 138165992 docker/registry/v2/blobs/sha256/fa/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/data
2025-06-23 07:16:17 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/link
2025-06-23 07:16:16 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/link
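As a sanity check, the sizes of the three data objects still present in the listing above can be summed and compared with the "131 MB" the job reports as freed. This is only a sketch: the helper name parse_listing is made up for illustration, and reading "MB" as 1024*1024 bytes rounded down is an assumption about how the job formats the figure, not something taken from the Harbor source.

```python
# Listing copied from the `aws s3 ls --recursive` output taken after the Harbor GC run.
LISTING = """\
2025-06-23 07:16:17 5510 docker/registry/v2/blobs/sha256/85/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/data
2025-06-23 07:16:18 625 docker/registry/v2/blobs/sha256/bc/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6/data
2025-06-23 07:16:16 138165992 docker/registry/v2/blobs/sha256/fa/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/data
2025-06-23 07:16:17 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/link
2025-06-23 07:16:16 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/link
"""


def parse_listing(text):
    """Return {key: size_in_bytes} parsed from `aws s3 ls --recursive` output."""
    objects = {}
    for line in text.splitlines():
        parts = line.split(None, 3)  # date, time, size, key
        if len(parts) == 4:
            objects[parts[3]] = int(parts[2])
    return objects


objects = parse_listing(LISTING)
# Only the blob data objects count towards the space the GC claims to have freed.
blob_bytes = sum(size for key, size in objects.items() if "/blobs/" in key)
print(blob_bytes, blob_bytes // (1024 * 1024))
```

Under that assumption the sum comes to 138172127 bytes, i.e. 131 MiB, so the job's accounting appears to cover exactly the objects that are still sitting in the bucket.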
I downloaded the files via s3 cp to verify that they are not "empty".
Then I triggered the GC directly in the registry pod:
sh-5.2$ registry_DO_NOT_USE_GC garbage-collect --delete-untagged /etc/registry/config.yml
janek/test1
0 blobs marked, 3 blobs and 0 manifests eligible for deletion
blob eligible for deletion: sha256:85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c
INFO[0000] Deleting blob: /docker/registry/v2/blobs/sha256/85/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c go.version=go1.23.8 instance.id=ec24fe74-2c56-497f-aba7-0089ff92dd52 service=registry
blob eligible for deletion: sha256:bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6
INFO[0000] Deleting blob: /docker/registry/v2/blobs/sha256/bc/bc20a1d4cfd32f3915d95b11d781c9f3d6035de1a9f41fd888de6fb551863dc6 go.version=go1.23.8 instance.id=ec24fe74-2c56-497f-aba7-0089ff92dd52 service=registry
blob eligible for deletion: sha256:fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486
INFO[0000] Deleting blob: /docker/registry/v2/blobs/sha256/fa/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486 go.version=go1.23.8 instance.id=ec24fe74-2c56-497f-aba7-0089ff92dd52 service=registry
And the blobs got deleted... but the layer links, which now point at nothing, remain:
# aws s3 ls s3://registry/ --recursive
2025-06-23 07:16:17 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/link
2025-06-23 07:16:16 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/link
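To spot such leftovers in a larger bucket, the listing can be scanned for layer links whose blob data object is no longer present. The following is a rough sketch working on saved `aws s3 ls --recursive` output; the function name orphaned_layer_links and the two regular expressions are illustrative assumptions based on the key layout visible in the listings in this issue, not part of Harbor or Distribution.

```python
import re

# Listing copied from the output taken after the manual registry GC run.
LISTING = """\
2025-06-23 07:16:17 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c/link
2025-06-23 07:16:16 71 docker/registry/v2/repositories/janek/test1/_layers/sha256/fa0600c9647164ba2afff0613cc464cd2c882d5a228057d41932a60f13be4486/link
"""

# Key layouts as they appear in the listings of this issue.
LAYER_LINK = re.compile(
    r"repositories/(?P<repo>.+)/_layers/sha256/(?P<digest>[0-9a-f]{64})/link$"
)
BLOB_DATA = re.compile(r"blobs/sha256/[0-9a-f]{2}/(?P<digest>[0-9a-f]{64})/data$")


def orphaned_layer_links(listing):
    """Return (repository, digest) pairs for layer links without a blob data object."""
    keys = []
    for line in listing.splitlines():
        parts = line.split(None, 3)  # date, time, size, key
        if len(parts) == 4:
            keys.append(parts[3])

    # Digests for which a blob data object still exists in the bucket.
    present = set()
    for key in keys:
        match = BLOB_DATA.search(key)
        if match:
            present.add(match.group("digest"))

    orphans = []
    for key in keys:
        match = LAYER_LINK.search(key)
        if match and match.group("digest") not in present:
            orphans.append((match.group("repo"), match.group("digest")))
    return orphans


orphans = orphaned_layer_links(LISTING)
for repo, digest in orphans:
    print(repo, "sha256:" + digest)
```

The script only reports candidates; whether it is safe to delete the link objects by hand is a separate question it does not answer.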
It seems to be an issue with Distribution Registry and/or Harbor itself. We could work around it with a manual GC run to clean up the blobs, but overall it should be treated as an important bug.
Cheers, Janek
Did you check the permission to your S3 bucket?
Sure, otherwise nothing could be deleted.
tl;dr:
- Harbor GC only cleans up manifest objects from S3 (jobservice log shows that other objects like blobs are deleted though)
- Direct run of registry_DO_NOT_USE_GC will remove the orphaned blobs
- Still leftovers: empty layers
@sizowie Thanks for writing up this issue, are you using AWS S3 or S3 compatible storage when you see this problem?
S3 compatible storage. Tested with Hitachi Content Platform and NetApp StorageGrid.
The same issue is happening on ceph S3 storage.
@sizowie can you share the registryctl log? And is it possible to see the logs on the S3 server side? Harbor's GC actually shares the same code base with distribution: it calls the storage driver to delete the files.
@sizowie and can you set up a docker distribution v2.8.3 with the same object storage and try the GC there?
Sorry for the late response, I had no free time to set up a test environment for that and I also don't have access to the S3 infrastructure.
But I found the problem. For each blob that needs to be purged, Harbor's GC (registryctl) logs:
2025-10-01T13:47:34Z [DEBUG] [/lib/http/error.go:63]: {"errors":[{"code":"NOT_FOUND","message":"s3aws: Path not found: /docker/registry/v2/blobs/sha256/15/15c4ac6d4798ad6300c267fe5e09efecb4cb0afe4ce7a276cfeb50ce24a40a31"}]}
Unlike regular filesystems, S3 doesn't have "directory" objects. The path in the error message doesn't exist as an object in S3 (that's why we see "Path not found"); what exists instead are /docker/registry/v2/.../<sha>/data objects. Those are the objects that must be removed.
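The difference between looking up a "directory" path and matching keys by prefix can be illustrated with a toy in-memory bucket. This is a simplified model of a flat keyspace only; FlatBucket and its methods are hypothetical names invented here, and the code does not reproduce what the s3aws driver or any particular S3-compatible store actually does.

```python
class FlatBucket:
    """Toy stand-in for an object store: a flat map of key -> bytes, no directories."""

    def __init__(self, keys):
        self.objects = {key: b"" for key in keys}

    def head(self, key):
        # Exact-key lookup: a "directory" path is not a key, so it is not found.
        if key not in self.objects:
            raise KeyError(f"Path not found: {key}")
        return key

    def list_prefix(self, prefix):
        # Everything "under" a path is found by prefix match on the keys.
        prefix = prefix.rstrip("/") + "/"
        return sorted(key for key in self.objects if key.startswith(prefix))

    def delete_prefix(self, prefix):
        # Delete every object whose key starts with the given path.
        doomed = self.list_prefix(prefix)
        for key in doomed:
            del self.objects[key]
        return doomed


# Blob path taken from the listings in this issue; only the data object exists.
BLOB_DIR = (
    "docker/registry/v2/blobs/sha256/85/"
    "85f392b721880238e07f28855292cf69d1d30b87c5730441e020378571db565c"
)
bucket = FlatBucket([BLOB_DIR + "/data"])

try:
    bucket.head(BLOB_DIR)
    found = True
except KeyError:
    found = False

deleted = bucket.delete_prefix(BLOB_DIR)
print(found, deleted, len(bucket.objects))
```

In this model the blob path itself is never found, while a prefix listing finds the data object underneath it. If a store returns nothing for that kind of listing, a delete built on top of it would have nothing to remove, which would match the "Path not found" message above.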
This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.
This issue is still valid - I've run into it on a few different S3-compatible stores now
Make sure the registry deployment has the S3 credentials. Related PR: https://github.com/goharbor/harbor-helm/pull/1545