[feature request] create a dashboard for clone data

Open caniszczyk opened this issue 3 years ago • 7 comments

GitHub has this info available via their builtin dashboards, e.g., https://github.com/cncf/devstats/graphs/traffic

I don't know what the API looks like to pull this but since we have data for stars and forks, maybe we add that to the dashboard: https://kubevirt.devstats.cncf.io/d/3/stars-and-forks-by-repository?orgId=1

Maybe we call it 'stars-forks-and-clones' ;) or make a separate one just for clones.

caniszczyk avatar Mar 08 '21 16:03 caniszczyk

I'll research this on Friday, is this OK? We don't use the GitHub API in DevStats; we use GitHub archives data.

lukaszgryglicki avatar Mar 09 '21 06:03 lukaszgryglicki

works for me, no rush

caniszczyk avatar Mar 09 '21 14:03 caniszczyk

Doing some research, but I'm quite sure we don't have that data in GitHub archives (which is DevStats' data source). I created this issue/question/feature request in the meantime to confirm (now I'm digging through several hundred megabytes of GHA JSONs to see if there were any data format updates that include this info).

lukaszgryglicki avatar Mar 10 '21 11:03 lukaszgryglicki

I've checked a few huge JSONs with a few grep-like approaches (they're over 2.5G in size when converted from ndjson to correct JSON). I don't see any data that makes this feature request possible. I will also wait for any feedback on my feature request/issue from the previous post.
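For readers who want to repeat this kind of check without converting the ndjson first, here is a minimal sketch that tallies the `type` field of each event line by line. The function name `count_event_types` and the sample events are made up for illustration; this is not the script used in this thread.

```python
import json
from collections import Counter
from typing import Iterable


def count_event_types(lines: Iterable[str]) -> Counter:
    """Tally the `type` field of every event in an ndjson stream."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        event = json.loads(line)
        counts[event.get("type", "<missing>")] += 1
    return counts


# Made-up events in the general shape of GH Archive records (illustrative only).
sample = [
    '{"type": "WatchEvent", "repo": {"name": "cncf/devstats"}}',
    '{"type": "ForkEvent", "repo": {"name": "cncf/devstats"}}',
    '{"type": "WatchEvent", "repo": {"name": "cncf/devstats"}}',
]
counts = count_event_types(sample)
print(counts.most_common())
```

For a real archive file, passing `gzip.open(path, "rt")` as `lines` should work, assuming the file is gzip-compressed ndjson. If no event type related to clones shows up in the tally, that matches the finding above.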

All I can consider here is a hybrid approach: make DevStats also call the GitHub API to get this data. But even if I do so, I can only get the last 14 days of clones (see API docs), so I won't be able to get any historical data.
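A rough sketch of what the API side of such a hybrid approach could look like, assuming GitHub's repository traffic endpoint (`GET /repos/{owner}/{repo}/traffic/clones`) and a token with sufficient access to the repository. The helper names (`build_clones_request`, `parse_clones`, `fetch_daily_clones`) and the sample payload are hypothetical; this is not DevStats code.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_clones_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request for the traffic/clones endpoint."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/traffic/clones"
    return urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })


def parse_clones(payload: str) -> list:
    """Turn the JSON response into (date, count, uniques) tuples, one per day."""
    data = json.loads(payload)
    return [(d["timestamp"][:10], d["count"], d["uniques"])
            for d in data.get("clones", [])]


def fetch_daily_clones(owner: str, repo: str, token: str) -> list:
    """Perform the network call; only the window the API exposes is returned."""
    with urllib.request.urlopen(build_clones_request(owner, repo, token)) as resp:
        return parse_clones(resp.read().decode("utf-8"))


# Made-up payload in the shape of the API response (values are illustrative).
sample = ('{"count": 5, "uniques": 3, "clones": ['
          '{"timestamp": "2021-03-09T00:00:00Z", "count": 2, "uniques": 1},'
          '{"timestamp": "2021-03-10T00:00:00Z", "count": 3, "uniques": 2}]}')
rows = parse_clones(sample)
req = build_clones_request("cncf", "devstats", "dummy-token")
print(rows)
```

Because the endpoint only covers a recent window, a job like this would have to run regularly and store each day's rows itself to build up any history going forward.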

Should I proceed with that hybrid approach @caniszczyk? If so, it will take a rather long time; it's something totally new to implement.

Will hold until I get feedback on what to do.

lukaszgryglicki avatar Mar 10 '21 11:03 lukaszgryglicki

So @caniszczyk, the GHA maintainer confirmed (https://github.com/igrigorik/gharchive.org/issues/248#issuecomment-796505931) that GHA doesn't have that data, so the only possibility is the hybrid approach described here (https://github.com/cncf/devstats/issues/288#issuecomment-795282524). Please let me know if we want to proceed that way. I don't think this is a really good approach, though: we cannot get the historical data, we're limited to 14 days, and we need to call the GitHub API and maintain tokens for a few thousand GitHub repos. This will be slow and actually goes against the typical DevStats approach.

lukaszgryglicki avatar Mar 11 '21 07:03 lukaszgryglicki

let's hold off on this feature for now, leave the issue open though and put it on the backlog

caniszczyk avatar Mar 11 '21 14:03 caniszczyk

OK.

lukaszgryglicki avatar Mar 11 '21 15:03 lukaszgryglicki