Sync with Github error (Request failed with status code: 504)

Open thawn opened this issue 10 months ago • 44 comments

This is a re-opening of #957 and #1107. Please see the discussions there.

The issue is (additionally) related to the number of organizations a user is a member of and the number of repositories in them.

The only workaround currently is to disconnect repositories and reconnect them, which for users with many repositories is very sub-optimal.

Related to Zenodo Support email [Ticket#332699]

I suggest reopening #957 until the issue is actually fixed (and not just by the suboptimal workaround).

thawn avatar Feb 02 '25 12:02 thawn

I consistently get this error too.

cboettig avatar Feb 15 '25 23:02 cboettig

Same here, on 2025-02-18: I clicked the sync button multiple times.

Directly after clicking on the sync button for the ~10th time, the server crashed ... :-/ . Unsure if it was due to that...

When the server came back up, I tried again, no success yet.

richelbilderbeek avatar Feb 18 '25 08:02 richelbilderbeek

> This is a re-opening of #957 and #1107

Thanks for taking on the torch, @thawn! 🔥 I will update my support ticket with this version as well.

joshmoore avatar Feb 18 '25 08:02 joshmoore

I managed to work around the issue by logging out and then logging back in.

But it would be great to see it fixed.

alin256 avatar Feb 18 '25 09:02 alin256

> I managed to work around the issue by logging out and then logging back in.

I tried that too. It does not work for me ... :-/

richelbilderbeek avatar Feb 18 '25 09:02 richelbilderbeek

> I tried that too. It does not work for me ...

I tried creating a new public repository to test the functionality. I could not reproduce my previous solution, but at some point (within a few minutes) the new repository showed up in the Zenodo list despite the 504 errors. If you have a list but a new repository is missing, it might also be due to e.g. organizational policies. That happened to me...

https://github.com/alin256/zenodo-test

alin256 avatar Feb 18 '25 10:02 alin256

I get a 400 error after trying too often to sync (and getting the 504 error many times).

I assume the problem is that I am a member of multiple organisations, some having hundreds of repositories...? I can imagine that it may take too long to sync...

richelbilderbeek avatar Feb 18 '25 10:02 richelbilderbeek

Here is a summary of those involved in this protracted discussion to date:

| User | Public Repos | Public Orgs | First Comment |
|---|---|---|---|
| Issue 957 | | | |
| AliciaMstt | 32 | 1 | Apr 13, 2024 |
| yarikoptic | 1.1k | 25* | Jun 16, 2024 (issue 895) |
| joshmoore | 377 | 32* | Jul 16, 2024 |
| wsnoble | 2 | 0 | Jul 26, 2024 |
| zimolzak | 135 | 0 | Jul 30, 2024 |
| Blake Naccarato | 62 | 1 | Sep 22, 2024 |
| Peter Dudfield | 11 | 0 | Oct 25, 2024 |
| thawn | 34 | 0 | Nov 18, 2024 |
| e-kotov | 74 | 5 | Dec 18, 2024 |
| alin256 | 144 | 0 | Jan 29 |
| Issue 1107 | | | |
| krokicki | 47 | 5* | Jan 7 |
| bethac07 | 32 | 5 | Jan 15 |
| caufieldjh | 57 | 3* | Jan 23 |
| lldelisle | 75 | 1 | Jan 27 |
| Vladimir Alexiev | 113 | 8* | Jan 29 |
| Issue 1118 (here) | | | |
| cboettig | 179 | 10* | Feb 16 |
| Richèl Bilderbeek | 1.4k | 5* | Feb 18 |
| Philip Chase | 106 | 1 | Apr 15 |

*: at least one org with more than 100 repositories

cc top-contributors: @slint @alejandromumo @zzacharo @yashlamba

joshmoore avatar Feb 18 '25 10:02 joshmoore

I'm now wondering if there could be some correlation with e.g. a server-side deployment or cache invalidation... 🤔

joshmoore avatar Feb 18 '25 10:02 joshmoore

@joshmoore thanks for taking the systematic approach! I think you may be on to something here. Added info: I am a member of two private organizations, one of which has 50 repos, the other 21.

thawn avatar Feb 19 '25 14:02 thawn

An update from Zenodo Support email [Ticket#332699]:

> Thank you for your investigative work in the attached issue, that is really helpful for us to see the scale of the issue at hand.
>
> We're currently looking into how to circumvent this sort of limitation and reduce both the number of repositories that need to be synchronised and displayed on the GitHub settings page. Some possibilities would be to group repositories by their GitHub organisation, depending on if they're forks/sources, or how recently they were active. If you have any opinions on this matter we would be grateful to hear them.
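
To make those options concrete, here is a rough Python sketch of grouping by owner while skipping forks and inactive repositories. It is only an illustration, not how Zenodo does or will do it: the `group_repos` helper, the one-year cutoff, and the sample data are all made up, and the only thing borrowed from GitHub is the field names (`full_name`, `fork`, `pushed_at`) that its REST API returns for a repository.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def group_repos(repos, now, max_age_days=365):
    """Group non-fork, recently pushed repos by owner (user or organisation)."""
    cutoff = now - timedelta(days=max_age_days)
    grouped = defaultdict(list)
    for repo in repos:
        # GitHub timestamps look like 2025-01-10T12:00:00Z
        pushed = datetime.fromisoformat(repo["pushed_at"].replace("Z", "+00:00"))
        if repo["fork"] or pushed < cutoff:
            continue  # skip forks and repositories with no recent pushes
        owner = repo["full_name"].split("/", 1)[0]
        grouped[owner].append(repo["full_name"])
    return dict(grouped)


# Hypothetical sample data, not real repositories.
SAMPLE = [
    {"full_name": "org-a/tool", "fork": False, "pushed_at": "2025-01-10T12:00:00Z"},
    {"full_name": "org-a/old", "fork": False, "pushed_at": "2019-05-01T08:00:00Z"},
    {"full_name": "me/forked", "fork": True, "pushed_at": "2025-02-01T09:30:00Z"},
    {"full_name": "me/paper-code", "fork": False, "pushed_at": "2024-11-20T17:45:00Z"},
]
NOW = datetime(2025, 2, 21, tzinfo=timezone.utc)

print(group_repos(SAMPLE, NOW))
```

With a filter like this, only two of the four sample repositories would need to be listed, one per owner.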

joshmoore avatar Feb 21 '25 14:02 joshmoore

I'm also affected... :/ Let me know if I can help debugging this issue somehow

Hoeze avatar Feb 26 '25 14:02 Hoeze

Carlin from Zenodo reached out saying that the sync method is the one that is being investigated. I extracted this gist and found that it takes a few seconds to choose some 1250 of 4400 repositories for further processing (Those starting with "ok"). If you want to run that, it might be a more useful metric than what I did above.

joshmoore avatar Feb 26 '25 18:02 joshmoore

I am also experiencing the 504 error code when trying to "Sync now". It took a while, but after about 20 minutes, new GitHub repos appeared in Zenodo. If I read the other comments correctly, this is improved from the previous descriptions of this problem.

FYI, I am in four organizations. Their repo counts are 405, 5, 6, and 1. My sync issues are with the org with 405 repos.

pbchase avatar Apr 09 '25 20:04 pbchase

FWIW, you can list me as affected as well; subscribing. I thought I had reported back, but maybe not (since I do not see myself listed). ATM it says "(updated a year ago)", and I might have some thousands of repos across some dozens of organizations.

edit: for a workaround, I was hoping we could toggle via API somehow but failed to find anything related on https://developers.zenodo.org/ :-/

yarikoptic avatar Apr 14 '25 17:04 yarikoptic

I have now found my older issue; @joshmoore, could you please add it to your table?

  • https://github.com/zenodo/zenodo-rdm/issues/895

yarikoptic avatar Apr 14 '25 17:04 yarikoptic

Updated @yarikoptic and @pbchase. Thanks! Others, keep 'em comin'.

joshmoore avatar Apr 15 '25 08:04 joshmoore

I'm facing a similar issue. I'm trying to archive a github repo on zenodo for submission of a manuscript, and while everything seemed to work fine on https://sandbox.zenodo.org/login, I'm also getting a Request failed with status code: 400 error when I try to sync my repositories, and the list says it was updated a year ago.

I've tried to revoke OAuth access from github's site, but I can't seem to find a way to re-trigger zenodo to ask for access though.

Any ideas?

EDIT: logging into zenodo via github, rather than my orcid account, prompted me to regrant oAuth access, and that fixed the sync issue as well.

pmoris avatar Apr 16 '25 15:04 pmoris

> logging into zenodo via github, rather than my orcid account, prompted me to regrant oAuth access, and that fixed the sync issue as well.

That might be a better way to fix the 400 error when sync fails.

Did your activated git repos in Zenodo remain activated? That would improve the delete-your-github-account-from-zenodo method that erases all of the activations.

pbchase avatar Apr 16 '25 15:04 pbchase

FWIW: I am logged in via GitHub and I see "Request failed with status code: 504". Given that 504 is "Gateway Timeout", this makes sense for cases with a considerable number of associated repos: the request does not wait long enough and times out due to its own timeout policies. I suspect that the other cases (e.g. 400 Bad Request) might be a different problem, with the request being malformed for some reason.

yarikoptic avatar Apr 16 '25 16:04 yarikoptic

likely this is the function where the error comes from

  • https://github.com/inveniosoftware/invenio-github/blob/b3812241ebcf1376d4e162b7bdca6c66f19fc21c/invenio_github/assets/semantic-ui/js/invenio_github/index.js#L67

and maybe we should just propose increasing the timeouts there, in particular on the line above, where it is set to /** Timeout set to 100000 ms = 1m40s .*/

FWIW, I tried to simulate that manually from the CLI but got a 400, so it is likely not an entirely proper request, or maybe the token was extracted incorrectly (changing it did not change the error code either)

❯ curl -X POST "https://zenodo.org/api/user/github/repositories/sync" \
     -H "Content-Type: application/json" \
     -H "X-CSRFToken: $csrftoken"
{"message":"The browser (or proxy) sent a request that this server could not understand.","status":400}
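
For anyone repeating the experiment from a script, here is a minimal Python sketch of the same POST, built but not sent. The endpoint and the X-CSRFToken header come from the curl call above. The Cookie and Referer headers, the cookie names, and the empty JSON body are assumptions about what the server might additionally expect, and none of this has been verified against Zenodo.

```python
import urllib.request

# Endpoint taken from the curl call above.
SYNC_URL = "https://zenodo.org/api/user/github/repositories/sync"


def build_sync_request(csrf_token, session_cookie):
    """Build (but do not send) the POST request with CSRF header and cookies."""
    headers = {
        "Content-Type": "application/json",
        "X-CSRFToken": csrf_token,
        # Assumption: the server compares the header with a csrftoken cookie
        # and needs the logged-in session cookie; both cookie names are guesses.
        "Cookie": f"csrftoken={csrf_token}; session={session_cookie}",
        # Assumption: a same-origin Referer may be checked over HTTPS.
        "Referer": "https://zenodo.org/account/settings/github/",
    }
    # Assumption: an empty JSON object as the body.
    return urllib.request.Request(SYNC_URL, data=b"{}", headers=headers, method="POST")


req = build_sync_request("TOKEN-PLACEHOLDER", "SESSION-PLACEHOLDER")
print(req.get_method(), req.full_url)
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Passing `req` to `urllib.request.urlopen` would actually send it; the placeholders would have to be replaced with values copied from a logged-in browser session first.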

edit: I monitored it in the browser; the request is indeed correct, but likely more needs to be provided with it, and it times out at 30 sec

edit2: actual sync function is at https://github.com/inveniosoftware/invenio-github/blob/b3812241ebcf1376d4e162b7bdca6c66f19fc21c/invenio_github/views/github.py#L142 . But since it is gateway timeout, it is likely a generic timeout limit for API set at the platform level somewhere...
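
A tiny sketch of why the client-side value is probably not the one that matters: a request fails as soon as the strictest layer in the chain gives up. The 100000 ms is the value from the JS file; the 30 s gateway limit is only an assumption based on the timeout seen in the browser, and `effective_timeout_s` is a made-up helper.

```python
def effective_timeout_s(*layer_timeouts_s):
    """Return the timeout of the strictest layer in the request chain."""
    return min(layer_timeouts_s)


CLIENT_TIMEOUT_S = 100_000 / 1000  # 100000 ms from the invenio-github JS
GATEWAY_TIMEOUT_S = 30  # assumption: taken from the ~30 s failure seen in the browser

limit = effective_timeout_s(CLIENT_TIMEOUT_S, GATEWAY_TIMEOUT_S)
print(limit)

# Tripling only the client-side value leaves the effective limit unchanged.
print(effective_timeout_s(CLIENT_TIMEOUT_S * 3, GATEWAY_TIMEOUT_S))
```

Under that assumption, the sync has to finish within the gateway limit no matter what the client is willing to wait for.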

yarikoptic avatar Apr 16 '25 18:04 yarikoptic

Filed an issue there

  • https://github.com/inveniosoftware/invenio-app-rdm/issues/3032

for good measure, since there are some timeouts in the code, so there is a chance it is not deployment-specific.

yarikoptic avatar Apr 18 '25 23:04 yarikoptic

> Did your activated git repos in Zenodo remain activated? That would improve the delete-your-github-account-from-zenodo method that erases all of the activations.

I didn't have any activated git repos prior to this, so I can't answer this I'm afraid.

pmoris avatar Apr 19 '25 14:04 pmoris

I consistently have this problem as well! I'm logged in to Zenodo with my GitHub account. I believe I am part of several organizations on GitHub too.

cjvanlissa avatar Apr 28 '25 12:04 cjvanlissa

Just hopping on the thread to note that I noticed new github releases weren't getting picked up. Tried the usual disconnecting github and then reconnecting and toggling switches back on, but still not syncing and getting 504 if I hit the sync button.

yoachim avatar May 04 '25 18:05 yoachim

Adding a +1 to this issue. It seems that trying to sync, then logging out and in again, resolved it, per https://github.com/zenodo/zenodo-rdm/issues/1118#issuecomment-2665116236

geryan avatar May 06 '25 00:05 geryan

I am having the same issue.

tfjmp avatar Jun 05 '25 16:06 tfjmp

+1 with this issue.

chrishavlin avatar Jun 12 '25 16:06 chrishavlin

I had originally reported our situation on the old thread, so I just wanted to update here. My main org JaneliaSciComp has 274 public repositories. Years ago I was able to pull everything in and generate DOIs for about a dozen of those repos, and those are still working well. But starting two years ago, the sync ceased to function, with the 504 timeout.

I wonder if the developers would consider adding a way to add a single repository, bypassing the scan entirely. That is all I would need to use this tool for new repositories that we want to publish. I'd be perfectly happy to type in the repo names or URLs by hand, one-by-one.

krokicki avatar Jun 18 '25 17:06 krokicki

Maybe they could use this code to fix it: https://github.com/AkashRajpurohit/git-sync

or this one https://github.com/josegonzalez/python-github-backup

priya-gitTest avatar Jun 18 '25 20:06 priya-gitTest