zenodo-rdm
Sync with Github error (Request failed with status code: 504)
This is a re-opening of #957 and #1107. Please see the discussions there.
Issue is (additionally) related to the number of organizations and repositories a user is a member of.
The only workaround currently is to disconnect repositories and reconnect them, which for users with many repositories is very sub-optimal.
Related to Zenodo Support email [Ticket#332699]
I suggest reopening #957 until the issue is actually fixed (and not just by the suboptimal workaround).
I consistently get this error too.
Same here, on 2025-02-18: I clicked the sync button multiple times.
Directly after clicking on the sync button for the ~10th time, the server crashed ... :-/ . Unsure if it was due to that...
When the server came back up, I tried again, no success yet.
Thanks for taking on the torch, @thawn! 🔥 I will update my support ticket with this version as well.
I managed to work around the issue by logging out and then logging back in.
But it would be great to see it fixed.
> I managed to work around the issue by logging out and then logging back in.

I tried that too. It does not work for me ... :-/

> I tried that too. It does not work for me ...
I tried creating a new public repository to test the functionality. I cannot reproduce my previous solution, but at some point (within a few minutes) the repository showed up in the Zenodo list despite the 504 errors. If you have a list but a new repository is missing, it might also be due to e.g. organizational policies. That happened to me...
https://github.com/alin256/zenodo-test
I get a 400 error after trying too often to sync (and getting the 504 error many times).
I assume the problem is that I am a member of multiple organisations, some having hundreds of repositories...? I can imagine that it may take too long to sync...
Here is a summary of those involved in this protracted discussion to date:
| User | Public Repos | Public Orgs | First Comment |
|---|---|---|---|
| Issue 957 | | | |
| AliciaMstt | 32 | 1 | Apr 13, 2024 |
| yarikoptic | 1.1k | 25* | Jun 16, 2024 (issue 895) |
| joshmoore | 377 | 32* | Jul 16, 2024 |
| wsnoble | 2 | 0 | Jul 26, 2024 |
| zimolzak | 135 | 0 | Jul 30, 2024 |
| Blake Naccarato | 62 | 1 | Sep 22, 2024 |
| Peter Dudfield | 11 | 0 | Oct 25, 2024 |
| thawn | 34 | 0 | Nov 18, 2024 |
| e-kotov | 74 | 5 | Dec 18, 2024 |
| alin256 | 144 | 0 | Jan 29 |
| Issue 1107 | | | |
| krokicki | 47 | 5* | Jan 7 |
| bethac07 | 32 | 5 | Jan 15 |
| caufieldjh | 57 | 3* | Jan 23 |
| lldelisle | 75 | 1 | Jan 27 |
| Vladimir Alexiev | 113 | 8* | Jan 29 |
| Issue 1118 (here) | | | |
| cboettig | 179 | 10* | Feb 16 |
| Richèl Bilderbeek | 1.4K | 5* | Feb 18 |
| Philip Chase | 106 | 1 | Apr 15 |
*: at least one org with more than 100 repositories
cc top-contributors: @slint @alejandromumo @zzacharo @yashlamba
I'm now wondering if there could be some correlation with e.g. a server-side deployment or cache invalidation... 🤔
@joshmoore thanks for taking the systematic approach! I think you may be on to something here. Added info: I am a member of two private organizations, one of which has 50 repos, the other 21.
An update from Zenodo Support email [Ticket#332699]:
> Thank you for your investigative work in the attached issue, that is really helpful for us to see the scale of the issue at hand.
>
> We're currently looking into how to circumvent this sort of limitation and reduce both the number of repositories that need to be synchronised and displayed on the GitHub settings page. Some possibilities would be to group repositories by their GitHub organisation, depending on if they're forks/sources, or how recently they were active. If you have any opinions on this matter we would be grateful to hear them.
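To make the grouping idea from the support email concrete, here is a minimal sketch of what "group by organisation, fork/source, and recent activity" could look like. It is not Zenodo's implementation. The sample records, the `group_repos` helper, the bucket names, and the one-year cutoff are all hypothetical; the field names `full_name`, `fork`, and `pushed_at` are modelled on GitHub's repository objects.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical repository records, for illustration only.
REPOS = [
    {"full_name": "org-a/tool", "fork": False, "pushed_at": "2025-01-10T12:00:00+00:00"},
    {"full_name": "org-a/old-fork", "fork": True, "pushed_at": "2025-02-01T08:30:00+00:00"},
    {"full_name": "org-a/archive", "fork": False, "pushed_at": "2019-05-01T00:00:00+00:00"},
    {"full_name": "alice/paper-code", "fork": False, "pushed_at": "2024-11-20T09:15:00+00:00"},
]

# Fixed reference time so the example is reproducible.
NOW = datetime(2025, 3, 1, tzinfo=timezone.utc)


def group_repos(repos, now, max_age_days=365):
    """Group repositories by owner, separating recently active sources from the rest.

    The cutoff and the two bucket names are assumptions made for this sketch.
    """
    cutoff = now - timedelta(days=max_age_days)
    groups = defaultdict(lambda: {"active_sources": [], "other": []})
    for repo in repos:
        owner = repo["full_name"].split("/", 1)[0]
        pushed = datetime.fromisoformat(repo["pushed_at"])
        # A repository counts as an "active source" if it is not a fork
        # and was pushed to within the cutoff window.
        is_active_source = (not repo["fork"]) and pushed >= cutoff
        bucket = "active_sources" if is_active_source else "other"
        groups[owner][bucket].append(repo["full_name"])
    return dict(groups)


if __name__ == "__main__":
    for owner, buckets in group_repos(REPOS, NOW).items():
        print(owner, buckets)
```

With a grouping like this, a settings page could show only the "active sources" of each organisation by default and load the rest on demand, which is one way to reduce what has to be synchronised and displayed up front.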
I'm also affected... :/ Let me know if I can help debugging this issue somehow
Carlin from Zenodo reached out saying that the sync method is the one that is being investigated. I extracted this gist and found that it takes a few seconds to choose some 1250 of 4400 repositories for further processing (Those starting with "ok"). If you want to run that, it might be a more useful metric than what I did above.
I am also experiencing the 504 error code when trying to "Sync now". It took a while, but after about 20 minutes, new GitHub repos appeared in Zenodo. If I read the other comments correctly, this is improved from the previous descriptions of this problem.
FYI, I am in four organizations. Their repo counts are 405, 5, 6, and 1. My sync issues are with the org with 405 repos.
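As a rough illustration of why repository counts like these could matter, the sketch below estimates how many GitHub API requests a full listing would need. GitHub's REST list endpoints return at most 100 items per page; everything else here is an assumption: the helper names are hypothetical, and whether Zenodo's sync makes any additional request per repository is unknown (the `per_repo_requests` parameter only models that possibility).

```python
import math

# GitHub REST list endpoints cap the page size at 100 items.
PER_PAGE_MAX = 100


def pages_needed(repo_count, per_page=PER_PAGE_MAX):
    """Number of paginated list requests for one organisation.

    At least one request is needed even when the organisation has no repositories.
    """
    return max(1, math.ceil(repo_count / per_page))


def estimate_sync_requests(org_repo_counts, per_repo_requests=0, per_page=PER_PAGE_MAX):
    """Estimate total requests: paginated listing plus optional per-repository calls.

    per_repo_requests is a hypothetical knob; it does not describe
    what the actual sync code does.
    """
    listing = sum(pages_needed(n, per_page) for n in org_repo_counts)
    follow_ups = sum(org_repo_counts) * per_repo_requests
    return listing + follow_ups


if __name__ == "__main__":
    counts = [405, 5, 6, 1]  # organisation sizes reported in the comment above
    print("listing only:", estimate_sync_requests(counts))
    print("with one follow-up call per repo:", estimate_sync_requests(counts, per_repo_requests=1))
```

Listing alone stays cheap under these assumptions; the request count only grows quickly if something is done per repository and sequentially, which would be one candidate explanation for a sync exceeding a gateway timeout.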
FWIW, you can list me as affected as well; subscribing. I thought I had reported back, but maybe not (since I do not see myself listed). ATM it says "(updated a year ago)", and I might have some thousands of repos across some dozens of organizations.
edit: for a workaround, I was hoping we could toggle via API somehow but failed to find anything related on https://developers.zenodo.org/ :-/
I have now found my older issue; @joshmoore, could you please add it to your table?
- https://github.com/zenodo/zenodo-rdm/issues/895
Updated @yarikoptic and @pbchase. Thanks! Others, keep 'em comin'.
I'm facing a similar issue. I'm trying to archive a github repo on zenodo for submission of a manuscript, and while everything seemed to work fine on https://sandbox.zenodo.org/login, I'm also getting a Request failed with status code: 400 error when I try to sync my repositories, and the list says it was updated a year ago.
I've tried to revoke OAuth access from github's site, but I can't seem to find a way to re-trigger zenodo to ask for access though.
Any ideas?
EDIT: logging into Zenodo via GitHub, rather than my ORCID account, prompted me to regrant OAuth access, and that fixed the sync issue as well.

> logging into Zenodo via GitHub, rather than my ORCID account, prompted me to regrant OAuth access, and that fixed the sync issue as well.
That might be a better way to fix the 400 error when sync fails.
Did your activated git repos in Zenodo remain activated? That would improve the delete-your-github-account-from-zenodo method that erases all of the activations.
FWIW: I am logged in via GitHub and I see "Request failed with status code: 504". Given that 504 is "Gateway Timeout", it makes sense for cases with a considerable number of associated repos: the sync takes longer than some layer is willing to wait, and the request is cut off by that layer's timeout policy. I suspect that the other cases (e.g. 400 Bad Request) are a different problem, with the request being malformed for some reason.
This is likely the function where the error comes from:
- https://github.com/inveniosoftware/invenio-github/blob/b3812241ebcf1376d4e162b7bdca6c66f19fc21c/invenio_github/assets/semantic-ui/js/invenio_github/index.js#L67
and maybe we should just propose increasing the timeouts there, in particular at the line above, where it is set to `/** Timeout set to 100000 ms = 1m40s .*/`
FWIW, I have tried to simulate that manually from the CLI but got a 400, so it is likely not an entirely proper request, or maybe the token was extracted incorrectly (changing it did not change the error code either).
❯ curl -X POST "https://zenodo.org/api/user/github/repositories/sync" \
-H "Content-Type: application/json" \
-H "X-CSRFToken: $csrftoken"
{"message":"The browser (or proxy) sent a request that this server could not understand.","status":400}
edit: monitored in the browser; the request is indeed correct, but it likely needs more information supplied with it, and it times out at 30 sec
edit2: the actual sync function is at https://github.com/inveniosoftware/invenio-github/blob/b3812241ebcf1376d4e162b7bdca6c66f19fc21c/invenio_github/views/github.py#L142 . But since it is a gateway timeout, it is likely a generic timeout limit for the API, set at the platform level somewhere...
Filed an issue there
- https://github.com/inveniosoftware/invenio-app-rdm/issues/3032
for good measure, since there are some timeouts in the code, so there is a chance it is not deployment-specific.

> Did your activated git repos in Zenodo remain activated? That would improve the delete-your-github-account-from-zenodo method that erases all of the activations.
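The timeout reasoning in this comment can be sketched as follows: when a request passes through several layers, each with its own timeout, the shortest one decides when the request is cut off. The figures are the ones reported above (100000 ms in the client code, about 30 s observed in the browser); the layer names and the `effective_timeout_s` helper are hypothetical, and attributing the 30 s limit to a gateway is an assumption based on the 504 status.

```python
def effective_timeout_s(layer_timeouts_s):
    """Return (layer, seconds) for the layer with the shortest timeout.

    layer_timeouts_s maps a layer name to its timeout in seconds.
    """
    return min(layer_timeouts_s.items(), key=lambda item: item[1])


# Values taken from the thread; the layer names are illustrative.
LAYERS = {
    "browser_client": 100000 / 1000,  # 100000 ms set in the JS client code
    "gateway": 30.0,                  # observed cut-off; layer assumed from the 504
}

if __name__ == "__main__":
    layer, seconds = effective_timeout_s(LAYERS)
    print(f"request is cut off by {layer} after {seconds:.0f} s")

    # Raising only the client timeout does not change which layer wins.
    raised = dict(LAYERS, browser_client=300.0)
    print(effective_timeout_s(raised))
```

Under these assumptions, increasing the client-side value alone would not remove the 504; the shorter limit would have to change, or the sync would have to finish faster or run in the background.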
I didn't have any activated git repos prior to this, so I can't answer this I'm afraid.
I consistently have this problem as well! I'm logged in to Zenodo with my GitHub account. I believe I am part of several organizations on GitHub too.
Just hopping on the thread to note that I noticed new github releases weren't getting picked up. Tried the usual disconnecting github and then reconnecting and toggling switches back on, but still not syncing and getting 504 if I hit the sync button.
Adding a +1 to this issue. It seems that trying to sync, then logging out and in again, resolved it, per https://github.com/zenodo/zenodo-rdm/issues/1118#issuecomment-2665116236
I am having the same issue.
+1 with this issue.
I had originally reported our situation on the old thread, so I just wanted to update here. My main org JaneliaSciComp has 274 public repositories. Years ago I was able to pull everything in and generate DOIs for about a dozen of those repos, and those are still working well. But starting two years ago, the sync ceased to function, with the 504 timeout:
I wonder if the developers would consider adding a way to add a single repository, bypassing the scan entirely. That is all I would need to use this tool for new repositories that we want to publish. I'd be perfectly happy to type in the repo names or URLs by hand, one-by-one.
Maybe they can use this code to fix it: https://github.com/AkashRajpurohit/git-sync
or this one: https://github.com/josegonzalez/python-github-backup