gharchive.org

GitLab support

Open olearycrew opened this issue 6 years ago • 6 comments

Ever thought about adding GitLab support to the project? Maybe forking the project/creating gitlabarchive.org?

olearycrew avatar Nov 13 '18 03:11 olearycrew

Interesting idea. I'm not familiar with GitLab. Do they provide equivalent or similar APIs for tracking public activity?

igrigorik avatar Nov 13 '18 03:11 igrigorik

They (we) do: https://docs.gitlab.com/ee/api/

olearycrew avatar Nov 13 '18 15:11 olearycrew

Ah, neat. So, one issue we have with the GH API is that we're hitting API limits and missing events. Ideally, instead of us polling, we'd be subscribing to a pubsub channel (e.g. GCP pubsub). Do you think this is something you guys would be willing to explore and support? I've been pushing GitHub folks to expose this as well.

igrigorik avatar Nov 17 '18 07:11 igrigorik

Ping @brendano86, in case you missed the response from @igrigorik 🙂

Having public pubsub channels for both GitLab and GitHub would be very nice for data analysis purposes like this repo.

voxpelli avatar Mar 03 '19 11:03 voxpelli

cc: @annafil anything we can do to support this?

hamelsmu avatar Mar 04 '19 17:03 hamelsmu

@olearycrew The GitLab API currently does not allow bulk data collection.

1.3. When using, or attempting to use, the GitLab APIs, you agree: [...] 1.3.9. Not to use the GitLab APIs for the bulk collection or scraping of information.

And if you were to remove that clause, I have a feeling that there might be a bug report or two about the pagination API.

983 avatar Sep 18 '22 08:09 983

Since this issue was closed as "completed" rather than "not planned", I was curious whether GitLab support had been implemented. I downloaded a random dump and checked whether it contained any GitLab URLs. It did not, so either the traffic is very low or support has not been implemented.

wget https://data.gharchive.org/2024-01-01-15.json.gz
gunzip 2024-01-01-15.json.gz
grep -P 'url.{1,20}gitlab' 2024-01-01-15.json
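
For anyone who wants to repeat this check without unpacking the file to disk, here is a rough Python sketch of the same idea. It assumes the dump is gzip-compressed, newline-delimited JSON (one event per line), and it is slightly different from the grep above: instead of a regex on the raw text, it walks every string value in each event and tests whether it is an http(s) URL whose hostname contains "gitlab". The function names are made up for illustration and are not part of gharchive.org.

```python
import gzip
import json
from typing import Any, Iterable, Iterator, Tuple
from urllib.parse import urlsplit


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def is_gitlab_url(text: str, needle: str = "gitlab") -> bool:
    """True if text is an http(s) URL whose hostname contains the needle."""
    if not text.startswith(("http://", "https://")):
        return False
    try:
        host = urlsplit(text).hostname  # lowercased by urllib, or None
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    return host is not None and needle in host


def count_events_mentioning(
    lines: Iterable[str], needle: str = "gitlab"
) -> Tuple[int, int]:
    """Return (matching_events, total_events) for newline-delimited JSON."""
    matching = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        total += 1
        event = json.loads(line)
        if any(is_gitlab_url(s, needle) for s in iter_strings(event)):
            matching += 1
    return matching, total


def count_in_dump(path: str, needle: str = "gitlab") -> Tuple[int, int]:
    """Stream a .json.gz dump line by line and count matching events."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_events_mentioning(handle, needle)
```

Calling `count_in_dump("2024-01-01-15.json.gz")` on the still-compressed file from the wget step returns a `(matching, total)` pair. Note that this only looks at values that are themselves URLs, so a GitLab link embedded in a commit message or issue body would not be counted.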

983 avatar Feb 02 '24 07:02 983

Apologies, no, it was marked incorrectly: this is a wontfix.

igrigorik avatar Feb 07 '24 23:02 igrigorik

Understandable. Thank you for the clarification.

983 avatar Feb 08 '24 06:02 983