Ability to specify the GitLab instance URL

Open drummingdemon opened this issue 3 years ago • 9 comments

Can we define the GitLab instance URL? Would be really useful for making backups of private GitLab repos.

drummingdemon avatar Aug 17 '22 21:08 drummingdemon

It's currently hard-coded as https://gitlab.com/api/v4.
I don't think it'd take much to introduce an API URL override. Will take a look.

jonhadfield avatar Aug 18 '22 19:08 jonhadfield

Awesome, thanks so much in advance! 😀

drummingdemon avatar Aug 18 '22 19:08 drummingdemon

I'm not able to test this myself as I don't have my own GitLab setup, but I've just introduced an undocumented option to specify an API URL override for each provider:

  • GITHUB_APIURL
  • GITLAB_APIURL
  • BITBUCKET_APIURL

The value needs to include the https:// prefix.
All I've tested is that no existing functionality is impacted, so please give it a go and let me know if it works.

https://github.com/jonhadfield/soba/releases/tag/1.1.3-beta

jonhadfield avatar Aug 18 '22 19:08 jonhadfield

Awesome, it connected flawlessly - thanks for the quick response!

Soba has backed up 26 repos of the 128, which at first glance is a bit weird as one page (of the seven total) has 20 repositories. The repos that soba had backed up are seemingly from across multiple pages - any idea on how to get the remaining repos to show up?

drummingdemon avatar Aug 18 '22 20:08 drummingdemon

Thanks for the feedback. I'm fairly certain this is linked to an existing issue I've not got around to: https://github.com/jonhadfield/soba/issues/10. I'll try and take a look tomorrow.

jonhadfield avatar Aug 18 '22 21:08 jonhadfield

I've just put out a new release: https://github.com/jonhadfield/soba/releases/tag/1.1.3-beta.1
This adds pagination for group projects, which is where I suspect the issue is.
I may not be bringing back all groups though. How many do you have defined?

jonhadfield avatar Aug 21 '22 21:08 jonhadfield

Thanks for this - I have 12 groups in total within this GitLab instance. Sadly, there is no noticeable change in the number of backed-up repositories.

I did make two observations though:

  1. Only those Groups get backed up where I am the Owner of the Group
  2. Subgroups are skipped

drummingdemon avatar Aug 22 '22 07:08 drummingdemon

Ah, subgroups aren't automatically returned when requesting all groups. It's a separate API call. That shouldn't take long to add though.
For 'Owner only Groups', that was kind of intentional. As I don't use GitLab myself, I was concerned it could return ones that were loosely related, but not ones you cared about. For example, in GitHub, a user in your Organisation could own many repos in their personal GitHub setup, but you wouldn't want to back those up. Anyhow, I'll see what's involved in adding it.

jonhadfield avatar Aug 22 '22 10:08 jonhadfield

That makes total sense, thank you!

drummingdemon avatar Aug 22 '22 10:08 drummingdemon

Sorry for the delay.
I've discovered there are various ways to retrieve Groups, and my above comment on sub-Groups was incorrect. In 1.1.3-beta.1 I was attempting to retrieve 'all available' and 'all owned by user', but the result was that the latter overrode the former.
I've settled on a new way of retrieving Groups: retrieving based on the user's minimum access level to the Group. Along with group pagination being added to 1.1.3-beta.2, this should return all Groups and sub-Groups you have at least Guest access to. More details on the release page, including a way to override the minimum access level.

Please let me know how you get on.

jonhadfield avatar Oct 01 '22 10:10 jonhadfield

No worries, thanks for getting back to me on this!

I've pulled beta.2 and it is somewhat better: it now pulls 26 projects out of the 130 I have access to. The ones that get pulled have me as their owner - anything below that level does not get pulled. I've also tried setting GITLAB_GROUP_ACCESS_LEVEL_FILTER manually to 10 within the env vars, with the same result. I then raised it to 30, still with no change - any suggestions on what I should try?

drummingdemon avatar Oct 01 '22 20:10 drummingdemon

It turns out retrieving Groups by minimum access level doesn't work across the whole of GitLab, but only local ones. The alternative is to retrieve Projects by minimum access level, so I've switched the behaviour in my code and proven this works as I can now clone repositories from another user where I have a Project access level of at least Reporter (I spent a lot of time wondering why Guest wasn't sufficient, until I RTFM).
The previous env var is now replaced with: GITLAB_PROJECT_MIN_ACCESS_LEVEL as the filter, where the value is an integer as follows:

20: "Reporter"
30: "Developer"
40: "Maintainer"
50: "Owner"

If unset, the default is 20.

As always, please shout if it works or doesn't.

jonhadfield avatar Oct 02 '22 11:10 jonhadfield
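
For readers following along, the access-level mapping described above could be resolved from the environment roughly as follows. This is a minimal Python sketch, not soba's Go code: the function name `resolve_min_access_level` and the decision to raise an error on unrecognised values are assumptions made for illustration only.

```python
# Illustrative only: soba is written in Go and may treat bad values differently.
ACCESS_LEVELS = {20: "Reporter", 30: "Developer", 40: "Maintainer", 50: "Owner"}
DEFAULT_LEVEL = 20  # the default stated in the comment above


def resolve_min_access_level(env):
    """Return (level, name) read from an environment-like mapping."""
    raw = env.get("GITLAB_PROJECT_MIN_ACCESS_LEVEL", "").strip()
    if not raw:
        # Unset or empty: fall back to the documented default.
        return DEFAULT_LEVEL, ACCESS_LEVELS[DEFAULT_LEVEL]
    try:
        level = int(raw)
    except ValueError:
        raise ValueError(f"access level must be an integer, got {raw!r}") from None
    if level not in ACCESS_LEVELS:
        raise ValueError(f"unsupported access level {level}")
    return level, ACCESS_LEVELS[level]
```

Passing `os.environ` as `env` would read the real environment; passing a plain dict keeps the function easy to test.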

Thanks! I've tried without the env var and then with it, set to 20, 30 and finally 40 - all with the same result (the access levels were reported correctly across each run):

soba: 2022/10/02 11:43:30 gitlab.go:135: GitLab project minimum access level set to Reporter (20)
soba: 2022/10/02 11:43:30 gitlab.go:177: json: cannot unmarshal object into Go value of type githosts.gitLabGetProjectsResponse

drummingdemon avatar Oct 02 '22 11:10 drummingdemon

It seems the response you get from the API doesn't match the bit of code I use to store it. It's difficult to debug without knowing the response, so I've just pushed a new release that outputs the GitLab API response if you have an environment variable set as: SOBA_LOG=trace
Would you mind trying that and sending the output? It's the structure, more than the content, so anonymising any repo names, urls, etc. is fine. If easier, just email me at [email protected].
If it helps, the structure I'm expecting is a json list of records:

path
path_with_namespace
http_url_to_repo
ssh_url_to_repo
id
name
created_at 

Not sure all those fields are still useful in order to clone, so will take a note to review.  This is an example response (with only relevant fields kept) triggered by a test, where you see the response is json, starts with a [ to open the array/list, and is followed by a number of records (GitLab Projects) surrounded by { and }. After the final one there should be a closing ].

[
  {
    "id": 39877738,
    "name": "bourbon",
    "path": "bourbon",
    "path_with_namespace": "biscuits2/bourbon",
    "created_at": "2022-10-01T20:28:50.042Z",
    "ssh_url_to_repo": "git@gitlab.com:biscuits2/bourbon.git",
    "http_url_to_repo": "https://gitlab.com/biscuits2/bourbon.git",
    ...
   },
   ...
]

I'm expecting your response may be malformed in some way, or I'm triggering an API call that's invalid.

jonhadfield avatar Oct 02 '22 15:10 jonhadfield
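
The structure check described above can be tried by hand before sending any output. The following is a minimal Python sketch under stated assumptions: the helper name `check_projects_response` is hypothetical, and the field list is simply the one given in the comment above. The "cannot unmarshal object" error in the earlier log suggests the top-level value was a JSON object rather than a list, which is the first thing this sketch looks for.

```python
import json

# Field names taken from the list in the comment above.
EXPECTED_FIELDS = ("path", "path_with_namespace", "http_url_to_repo",
                   "ssh_url_to_repo", "id", "name", "created_at")


def check_projects_response(body):
    """Return a list of problems found in a raw response body (empty if none)."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(data, list):
        # A top-level object (for instance an error message) lands here.
        return [f"expected a JSON list, got {type(data).__name__}"]
    problems = []
    for index, record in enumerate(data):
        if not isinstance(record, dict):
            problems.append(f"record {index} is not an object")
            continue
        missing = [field for field in EXPECTED_FIELDS if field not in record]
        if missing:
            problems.append(f"record {index} is missing {', '.join(missing)}")
    return problems
```

An empty result means the body at least has the expected shape; it says nothing about whether every project was returned.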

Thanks, the trace option immediately showed me that I'd used the wrong access token - I had swapped tokens between runs to make sure I was starting fresh and sadly copied over the wrong one, so apologies for that!

Anyhow, a few more projects now get loaded: 35 are present out of the total 130 I have access to.

It seems like projects where I'm Owner are the only ones present in the response JSON - aside from a special case where a small fraction of a Group (where I'm Maintainer) is also present. From this Group, 4 projects are present out of the 37 (these 4 are not the first four in the Group's web frontend listing).

Let me know if I can try anything else, your quick responses are much appreciated! 👍

drummingdemon avatar Oct 02 '22 17:10 drummingdemon

"I've used the wrong access token"

Ah, that's a use-case I've not checked for. Will add it to the list.

Is it possible your token doesn't have the necessary access to the other projects? The API call I make is simply specifying a page size, i.e. number of results to return with each call (20 by default), and the minimum access level mentioned above. In theory, that should return everything, regardless of ownership. https://docs.gitlab.com/ee/api/projects.html#list-all-projects

Please could you check the response you get from GitLab (enabled with SOBA_LOG=trace) to see if a missing project (one you expected to be retrieved) is in the JSON output? I'm trying to work out if GitLab's API is providing the detail and I'm not acting upon it properly, or if GitLab is simply not returning them.

jonhadfield avatar Oct 02 '22 17:10 jonhadfield
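
The request described above (a page size plus a minimum access level) might be built along these lines. This is a hedged Python sketch rather than soba's actual code: `projects_url` is a hypothetical helper, `min_access_level`, `per_page` and `page` are query parameters from GitLab's documented projects API, and the explicit `page` parameter is an addition for illustration. The access token is left out of the URL here; GitLab also accepts it in a `PRIVATE-TOKEN` request header.

```python
from urllib.parse import urlencode

# Hard-coded default mentioned at the top of the thread.
DEFAULT_API_URL = "https://gitlab.com/api/v4"


def projects_url(api_url=DEFAULT_API_URL, min_access_level=20, per_page=20, page=1):
    """Build the 'list projects' URL for one page of results."""
    query = urlencode({
        "min_access_level": min_access_level,
        "per_page": per_page,
        "page": page,
    })
    # Tolerate a trailing slash on a self-hosted API URL override.
    return f"{api_url.rstrip('/')}/projects?{query}"
```

For a self-hosted instance, the value of `GITLAB_APIURL` would be passed as `api_url`, including the `https://` prefix.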

I've issued this request using CocoaRestClient based on the docs you've linked:

https://<gitlab_url>/api/v4/projects?private_token=<token>&per_page=150

And it returned a JSON that is 12,313 lines long - it seems to contain most of the projects I have access to, although right off the bat, the second project that is visible on the web frontend is not present in this JSON response, while the very first is (I'm Developer in both projects).

The Access Token has all the tickboxes enabled:

  • api
  • read_api
  • read_user
  • read_repository
  • write_repository
  • read_registry
  • write_registry

Is there any other URL param I should add to the request? Or should I try a different endpoint?

drummingdemon avatar Oct 03 '22 17:10 drummingdemon

I worked out that the pagination I added when retrieving projects via the groups endpoint went missing when I switched to making requests to the projects endpoint.
I've just pushed a new release that should now work: https://github.com/jonhadfield/soba/releases/tag/1.1.3-beta.5.

jonhadfield avatar Oct 04 '22 19:10 jonhadfield
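
To illustrate why missing pagination produces partial results, here is a minimal Python sketch of a page-by-page loop. It is not the fix that shipped in beta.5: `fetch_all_projects` and the injected `fetch_page` callable are hypothetical names, and stopping on a short page is a simplification (GitLab also reports the next page in an `X-Next-Page` response header). Without such a loop, only the first page of results (20 records by default) would be processed.

```python
def fetch_all_projects(fetch_page, per_page=20):
    """Collect project records from every page.

    fetch_page(page, per_page) is a hypothetical callable that returns the
    list of records for one page; it stands in for the real HTTP request.
    """
    projects = []
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        projects.extend(batch)
        if len(batch) < per_page:
            # A short or empty page means there is nothing left to fetch.
            return projects
        page += 1
```

Injecting `fetch_page` keeps the loop independent of any HTTP library, so it can be exercised with a fake that slices an in-memory list.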

Hats off to you, it now works as expected and has bundled up all the repos I have access to - awesome work! 🍻 This is now resolved with beta.5, so it can be closed now.


On a somewhat related note: I was wondering whether it is somehow possible to fetch the latest commit from the previous bundle to speed up subsequent soba runs (this way the git clone and bundle steps might be spared) - but as far as I can tell, the only way would be to restore the bundle back to a repo and then compare the latest commits...?

drummingdemon avatar Oct 04 '22 21:10 drummingdemon

Great news. Thanks for the feedback.

I'll copy your request into another issue so I can close this one. Will respond on there.

jonhadfield avatar Oct 05 '22 17:10 jonhadfield