
504 Gateway Time-out when replication uses pull mode

Open · lf1029698952 opened this issue 5 years ago · 32 comments

Source Harbor version: 1.7.1. Destination Harbor version: 1.8.0.

When I use pull mode to make regular backups and the source Harbor repo has too many image tags (probably above tens of thousands), I get the following errors:

[screenshots of the 504 Gateway Time-out errors]

Replication rule config: [screenshot]

The other replications are normal. When a replication starts in pull mode, the destination Harbor lists all tags of the source Harbor repos, and the source Harbor API responds too slowly. I will modify the nginx timeout config and retry.

lf1029698952 avatar Jun 13 '19 02:06 lf1029698952

The performance issue of listing tags is a known issue. We're working on figuring out a proper solution to improve it.

cc @ywk253100

steven-zou avatar Jun 14 '19 10:06 steven-zou

@cd1989 Do we have any solution to fix this as I noticed you added it into the 1.9 scope

ywk253100 avatar Aug 19 '19 08:08 ywk253100

@cd1989 Do we have any solution to fix this as I noticed you added it into the 1.9 scope

Not yet, but I want to work on it within the 1.9 scope.

cd1989 avatar Aug 20 '19 03:08 cd1989

@lf1029698952 The timeout error happened between nginx and the core service on the source Harbor; changing the default timeout in nginx should be a workaround.

@cd1989 As the error happened on the 1.7 Harbor, I don't think we have any solution to fix it in 1.9. Moving it out of 1.9.

ywk253100 avatar Aug 20 '19 05:08 ywk253100

I have changed the nginx timeout config to 10 minutes, but the job runs for 10 minutes and then errors out anyway. I suspect the performance of the Harbor API is simply that bad. This is an urgent problem to be solved. Thanks.

lf1029698952 avatar Aug 20 '19 06:08 lf1029698952

@lf1029698952 You need to make the timeout large enough to let the API calls complete.
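
A quick way to size that value is to time the tag-list call against the source Harbor directly. Below is a minimal Go sketch, assuming the Harbor 1.x tag-list endpoint /api/repositories/<project>/<repo>/tags; the host and credentials are placeholders for your own setup.

package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Placeholder URL: the Harbor 1.x tag-list endpoint for one repository.
	url := "https://source-harbor.example.com/api/repositories/myproject/myrepo/tags"

	// Generous client-side timeout so we can measure the full server-side duration.
	client := &http.Client{Timeout: 30 * time.Minute}

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		panic(err)
	}
	req.SetBasicAuth("admin", "ChangeMe") // placeholder credentials

	start := time.Now()
	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)

	// If this elapsed time is longer than the nginx proxy timeout,
	// the proxy answers 504 before the core service finishes.
	fmt.Printf("status=%d bytes=%d elapsed=%s\n", resp.StatusCode, len(body), time.Since(start))
}

Whatever the largest repository takes here is roughly the floor for the proxy timeout on the source side.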

ywk253100 avatar Aug 21 '19 08:08 ywk253100

A refactor is being designed; after that, the issue with the tag-listing API will disappear.

ywk253100 avatar Aug 21 '19 08:08 ywk253100

Please double check if this issue still exists in 2.x.

This is probably fixed now that we store tag info in the DB instead of the registry.

reasonerjt avatar May 25 '20 04:05 reasonerjt

Per the above comment, @lf1029698952, you can try with the latest Harbor (v2.5); the performance issue should be gone.

wy65701436 avatar Apr 13 '22 09:04 wy65701436

I have just tested it on v2.5 and we still experience 504 errors. The strange thing is that it only occurs on the grafana replication rule. We have about 50 replication rules that pull images from Docker Hub, and we only experience this problem with the grafana images. I used skopeo inspect to find the right location; it returns docker.io/grafana/grafana. Is this problem related?


2022-04-20T14:44:54Z [INFO] [/pkg/reg/adapter/dockerhub/client.go:93]: GET https://hub.docker.com/v2/repositories/grafana/?page=1&page_size=100
2022-04-20T14:45:25Z [ERROR] [/pkg/reg/adapter/dockerhub/adapter.go:410]: list repos error: 504 -- <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>

gwiersma avatar Apr 19 '22 10:04 gwiersma

We have the same problem. We were able to narrow it down to the page size in the Docker Hub adapter:

diff --git a/src/pkg/reg/adapter/dockerhub/adapter.go b/src/pkg/reg/adapter/dockerhub/adapter.go
index 0ff2fca65..6f6e8c7e3 100644
--- a/src/pkg/reg/adapter/dockerhub/adapter.go
+++ b/src/pkg/reg/adapter/dockerhub/adapter.go
@@ -254,7 +254,7 @@ func (a *adapter) FetchArtifacts(filters []*model.Filter) ([]*model.Resource, er
        log.Debugf("got %d namespaces", len(namespaces))
        for _, ns := range namespaces {
                page := 1
-               pageSize := 100
+               pageSize := 50
                n := 0
                for {
                        pageRepos, err := a.getRepos(ns, "", page, pageSize)
@@ -295,7 +295,7 @@ func (a *adapter) FetchArtifacts(filters []*model.Filter) ([]*model.Resource, er

                        var tags []string
                        page := 1
-                       pageSize := 100
+                       pageSize := 50
                        for {
                                pageTags, err := a.getTags(repo.Namespace, repo.Name, page, pageSize)
                                if err != nil {

The 504 comes from the Docker Hub API and occurs if the query takes longer than 30 seconds.
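
For reference, that cutoff can be reproduced outside Harbor by hitting the same Docker Hub endpoint that appears in the replication log with different page_size values and a 30-second client deadline. A rough Go sketch; the grafana namespace and the page sizes are just examples:

package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Mirror the ~30 second cutoff observed on the Docker Hub side.
	client := &http.Client{Timeout: 30 * time.Second}

	for _, pageSize := range []int{100, 50, 25} {
		url := fmt.Sprintf("https://hub.docker.com/v2/repositories/grafana/?page=1&page_size=%d", pageSize)
		start := time.Now()
		resp, err := client.Get(url)
		if err != nil {
			fmt.Printf("page_size=%d failed after %s: %v\n", pageSize, time.Since(start), err)
			continue
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Printf("page_size=%d status=%d bytes=%d elapsed=%s\n",
			pageSize, resp.StatusCode, len(body), time.Since(start))
	}
}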

NemesisRE avatar May 09 '22 12:05 NemesisRE

We are facing the same timeout issue with some Docker Hub replication rules (grafana/loki). Is there any workaround?

HammerNL89 avatar Jun 07 '22 15:06 HammerNL89

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Jul 08 '22 09:07 github-actions[bot]

We just faced the same issue on a Harbor->Harbor replication:

failed to fetch artifacts: failed to list artifacts of repository 'dev/jenkins/network-defense-manager': http error: code 504, message <html> <head><title>504 Gateway Time-out</title></head> <body> <center><h1>504 Gateway Time-out</h1></center> <hr><center>nginx</center> </body> </html>

This also caused an incident in the target Harbor, as it was brought down by a big increase in DB locks (the two arrows mark the start of two attempts to trigger replication on the source Harbor instance): [DB locks graph screenshot]

aitorpazos avatar Aug 05 '22 12:08 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Oct 05 '22 09:10 github-actions[bot]

Hi bot, this is not resolved AFAIK.

aitorpazos avatar Oct 08 '22 14:10 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Dec 09 '22 09:12 github-actions[bot]

Hi bot, this is not resolved AFAIK.

aitorpazos avatar Dec 30 '22 19:12 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Mar 01 '23 09:03 github-actions[bot]

Hi bot, this is not resolved AFAIK.

aitorpazos avatar Mar 01 '23 14:03 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar May 02 '23 09:05 github-actions[bot]

Not resolved AFAIK

aitorpazos avatar May 24 '23 12:05 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Jul 25 '23 09:07 github-actions[bot]

Not resolved AFAIK

aitorpazos avatar Jul 25 '23 09:07 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Sep 24 '23 09:09 github-actions[bot]

Not resolved AFAIK

aitorpazos avatar Sep 25 '23 16:09 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Nov 26 '23 09:11 github-actions[bot]

Not resolved AFAIK

aitorpazos avatar Nov 30 '23 10:11 aitorpazos

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

github-actions[bot] avatar Jan 31 '24 09:01 github-actions[bot]

Not resolved AFAIK

aitorpazos avatar Jan 31 '24 11:01 aitorpazos