github-awards

[Feature Request] Support for repo in organization (use more accurate ranking method)

Open fadils opened this issue 10 years ago • 10 comments

Star-based ranking of our own repositories is fine for a start. However, it doesn't take into account contributions to someone else's repository.

For example, Jonny has one repo with 100 stars. Andy has no repo of his own, but he is a regular open-source contributor and has more than 100 commits on the Linux repo (which has almost 20K stars).

It would be much better if the ranking method took contributions into account as well.

fadils avatar Mar 03 '15 00:03 fadils

It's great!

marioidival avatar Mar 03 '15 00:03 marioidival

I was thinking of adding something similar, that's a great idea.

We could use the number of contributions to the repo: https://developer.github.com/v3/repos/#list-contributors

vdaubry avatar Mar 06 '15 11:03 vdaubry

That could work, too. Although one commit to a 1-star repo should surely count for less than one commit to a 100-star repo.

I mean, ranking algorithms are a whole domain in themselves.

So it's up to you how closely you want the rank to reflect reality.

fadils avatar Mar 06 '15 12:03 fadils

I definitely agree that it's worth noting contributions made to repositories owned by others, as well as repositories that you are a collaborator on.

I won't even attempt to write an equation for this, or describe how this data could be retrieved, but I think the following items would all play into a 'ranking':

  • stars on repositories you own
  • stars on repositories you are a collaborator on
    • weighted somewhat less than stars on repos you own
    • could possibly include weighting similar to that listed below
  • stars on repositories you have contributed to
    • 'size' of contributions should have weight (lines added/removed? commits? merged PRs?)
    • 'staleness' of contributions should have weight (more recent contributions count for more?)

This way, someone who makes a small contribution to a large repo like node would still get 'credit' for it, but not as much as, say, someone who's a collaborator on express. The 'value' of those contributions could then erode over time (this might be debatable).
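A rough sketch of how the factors listed above could combine into a single score. Everything specific here is an assumption made for illustration: the role weights, the one-year half-life, the use of a commit share as the 'size' measure, and the names `Contribution`, `contribution_score`, and `rank_score`. Nothing in this thread settles on a formula.

```python
from dataclasses import dataclass

# Hypothetical weights: owned repos count fully, collaborator repos
# somewhat less, plain contributions less again.
ROLE_WEIGHT = {"owner": 1.0, "collaborator": 0.6, "contributor": 0.3}

# Hypothetical half-life: a contribution loses half its value per year.
HALF_LIFE_DAYS = 365.0


@dataclass
class Contribution:
    repo_stars: int
    role: str                # "owner", "collaborator", or "contributor"
    share: float = 1.0       # assumed 'size': fraction of the repo's commits
    age_days: float = 0.0    # days since the most recent contribution


def contribution_score(c: Contribution) -> float:
    """Score one repo: stars scaled by role, size, and staleness."""
    weight = ROLE_WEIGHT[c.role]
    if c.role == "owner":
        # Owned repos keep the plain star count, as in the current ranking.
        return c.repo_stars * weight
    decay = 0.5 ** (c.age_days / HALF_LIFE_DAYS)
    return c.repo_stars * weight * c.share * decay


def rank_score(contributions: list[Contribution]) -> float:
    """Total score for one user across all repos."""
    return sum(contribution_score(c) for c in contributions)


# Jonny and Andy from the opening post; Andy's 1% share is invented.
jonny = [Contribution(repo_stars=100, role="owner")]
andy = [Contribution(repo_stars=20_000, role="contributor", share=0.01)]
print(rank_score(jonny), rank_score(andy))
```

With weights like these, a small share of a very large repo and full ownership of a small repo land in the same range, which is the kind of trade-off the weights would need tuning for.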

Anyway, a thought I had and wanted to share on this issue.

jackwanders avatar Mar 17 '15 13:03 jackwanders

:+1:

RaVbaker avatar Apr 02 '15 08:04 RaVbaker

A quick update on this topic: I'm very interested in having this feature, and I'm looking for someone to help me build it.

Getting contributions to other repositories for all users is a lot of data. I think a good start would be to extract the data from GitHub Archive with Google BigQuery, then download and process the JSON with a rake task (similar to what I did for the first batch import to kick-start the project).
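As an illustration of the processing step only, here is a minimal Python sketch (not the rake task mentioned above) that counts pushed commits per user and repo from newline-delimited JSON events. It assumes each line follows the GitHub Events API layout, with `type`, `actor.login`, `repo.name`, and a `payload.size` commit count on `PushEvent`; the function name `count_push_commits` is hypothetical. Note that the pusher is not always the author of the commits, so this is only a proxy for contributions.

```python
import json
from collections import Counter
from typing import Iterable


def count_push_commits(lines: Iterable[str]) -> Counter:
    """Count pushed commits per (login, repo) from JSON event lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        event = json.loads(line)
        if event.get("type") != "PushEvent":
            continue  # only pushes are counted in this sketch
        login = (event.get("actor") or {}).get("login")
        repo = (event.get("repo") or {}).get("name")
        if not login or not repo:
            continue  # skip events missing the fields we rely on
        # Assumed field: payload.size is the number of commits in the push.
        # Fall back to 1 when it is absent.
        size = (event.get("payload") or {}).get("size", 1)
        counts[(login, repo)] += size
    return counts


# Usage with a file exported from the archive (the path is a placeholder):
# with open("events.json") as handle:
#     totals = count_push_commits(handle)
```

The resulting counts could then be joined with per-repo star counts to weight each contribution.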

If someone wants to try opening a PR for this, we can discuss it further. For now I'm leaving this issue open.

vdaubry avatar Apr 07 '15 10:04 vdaubry

The GitHub API allows you to access the contributor list with stats directly btw:

https://developer.github.com/v3/repos/statistics/#contributors

So one would not need to actually look at the full git commit history.
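A minimal sketch of reading that endpoint from Python with only the standard library. It assumes the documented response shape (a list of entries, each with an `author` object and a `total` commit count) and that a non-200 status such as 202 means the statistics are still being computed and the call should be retried later. The helper names `fetch_contributor_stats` and `commits_by_login` are made up for this example.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def fetch_contributor_stats(owner: str, repo: str,
                            token: Optional[str] = None) -> Optional[list]:
    """Fetch contributor statistics for one repo, or None if not ready."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/stats/contributors"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        if response.status != 200:
            # For example 202: stats are being generated, retry later.
            return None
        return json.load(response)


def commits_by_login(stats: list) -> dict:
    """Map each contributor login to their total commit count."""
    totals = {}
    for entry in stats:
        author = entry.get("author") or {}  # author may be null
        login = author.get("login")
        if login:
            totals[login] = entry.get("total", 0)
    return totals


# Usage (performs a network request, so it is left commented out):
# stats = fetch_contributor_stats("vdaubry", "github-awards")
# if stats is not None:
#     print(commits_by_login(stats))
```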

astrofrog avatar Apr 07 '15 10:04 astrofrog

Yep, but calling this for 18 million repos with a 5,000 requests/hour rate limit is not the way to go. Even if we only consider active repos, it won't work...
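To put rough numbers on that, assuming exactly one request per repo and a single API token (both simplifications; retries and pagination would only make it worse), 18 million requests at 5,000 per hour take 3,600 hours, or 150 days. The function name `hours_to_crawl` is just for illustration.

```python
REPO_COUNT = 18_000_000      # figure quoted above
REQUESTS_PER_HOUR = 5_000    # rate limit quoted above


def hours_to_crawl(repo_count: int, requests_per_hour: int,
                   requests_per_repo: int = 1) -> float:
    """Hours needed to query every repo under a fixed hourly rate limit."""
    return repo_count * requests_per_repo / requests_per_hour


hours = hours_to_crawl(REPO_COUNT, REQUESTS_PER_HOUR)
days = hours / 24
print(f"{hours:.0f} hours, about {days:.0f} days")
```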

GitHub Archive plus Google BigQuery to filter the data is the best way I can think of to get exhaustive data.

vdaubry avatar Apr 07 '15 11:04 vdaubry

+1, I also think rankings won't really be correct unless contributions to repositories owned by other organizations (which account for 99.99% of the work for some of us) are also taken into account.

danielfernandez avatar Sep 04 '15 09:09 danielfernandez

+1 on this, most of my stars are in an org repo and they are not accounted for.

sztomi avatar Mar 18 '19 22:03 sztomi