github-awards
[Feature Request] Support for repo in organization (use more accurate ranking method)
Star-based ranking from our own repositories is fine for a start. However, it doesn't take into account contributions to someone else's repository.
For example, Jonny has one repo with 100 stars. Andy has no repos of his own, but is a regular open-source contributor and has more than 100 commits on the Linux repo (which has almost 20K stars).
It would be much better if the ranking method takes contributions into account as well.
It's great!
I was thinking of adding something similar, that's a great idea.
We could use the number of contributions to the repo: https://developer.github.com/v3/repos/#list-contributors
That could work, too, although one commit to a 1-star repo should surely count for less than one commit to a 100-star repo.
I mean, ranking algorithms are a domain in themselves.
So it's up to you how closely you want the rank to reflect reality.
I definitely agree that it's worth noting contributions made to repositories owned by others, as well as repositories that you are a collaborator on.
I won't even attempt to write an equation for this, or describe how this data could be retrieved, but I think the following items would all play into a 'ranking':
- stars on repositories you own
- stars on repositories you are a collaborator on
  - weighted somewhat less than stars on repos you own
  - could possibly include weighting similar to that listed below
- stars on repositories you have contributed to
  - 'size' of contributions should have weight (lines added/removed? commits? merged PRs?)
  - 'staleness' of contributions should have weight (more recent contributions count for more?)
This way, someone who makes a small contribution to a large repo like node would still get 'credit' for it, but not as much as, say, someone who's a collaborator on express. And then the 'value' of those contributions could erode over time (this might be debatable).
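To make the idea above a bit more concrete, here is a rough Python sketch of such a score. Nothing in it comes from the project: the role weights, the one-year half-life, the `share` field and the `Involvement` type are all made-up assumptions for illustration, and the real numbers would need discussion.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative values only; none of these weights come from the project.
ROLE_WEIGHT = {"owner": 1.0, "collaborator": 0.6, "contributor": 0.3}
HALF_LIFE_DAYS = 365.0  # assumed: credit halves after one year of inactivity


@dataclass
class Involvement:
    repo_stars: int
    role: str            # "owner", "collaborator" or "contributor"
    share: float         # assumed: fraction of the repo's commits by this user
    last_activity: date  # date of the user's most recent contribution


def score(items, today):
    """Sum star credit over repos, scaled by role, contribution size and age."""
    total = 0.0
    for item in items:
        if item.role == "owner":
            # Owners get full credit for the stars on their own repos.
            factor = 1.0
        else:
            # Others get credit in proportion to their share, decayed by age.
            age_days = max((today - item.last_activity).days, 0)
            factor = item.share * 0.5 ** (age_days / HALF_LIFE_DAYS)
        total += item.repo_stars * ROLE_WEIGHT[item.role] * factor
    return total
```

With these made-up weights, a small share of a 20K-star repo still earns some credit, but less than owning a 100-star repo outright. Whether that balance is right is exactly the debatable part.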
Anyway, a thought I had and wanted to share on this issue.
:+1:
A quick update on this topic: I'm very interested in having this feature, and I'm looking for someone to help me build it.
Getting contributions to other repositories for all users is a lot of data. I think a good start would be to extract the data from GitHub Archive with Google BigQuery, then download and process the JSON with a rake task (similar to what I did for the first batch import to kick-start the project).
If someone wants to open a PR for this, we can discuss it further. For now I'm leaving this issue open.
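As a rough idea of what the processing step could look like, here is a small sketch. It is written in Python purely for illustration (the actual import would be a rake task), and it assumes the export is newline-delimited JSON where each event has a `type`, an `actor.login`, a `repo.name` and a `payload.size` field. That layout is an assumption and would need checking against the real export; `count_pushed_commits` is a hypothetical helper name.

```python
import json
from collections import Counter


def count_pushed_commits(lines):
    """Count commits pushed per (user, repo) from newline-delimited JSON events."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        event = json.loads(line)
        if event.get("type") != "PushEvent":
            continue  # only pushes are counted in this sketch
        actor = event.get("actor")
        repo = event.get("repo")
        login = actor.get("login") if isinstance(actor, dict) else None
        name = repo.get("name") if isinstance(repo, dict) else None
        if not login or not name:
            continue  # event does not match the assumed layout
        payload = event.get("payload")
        # Assumed: payload.size is the number of commits in the push.
        size = payload.get("size", 1) if isinstance(payload, dict) else 1
        counts[(login, name)] += size
    return counts
```

Note that this credits whoever pushed, who is not necessarily the author of each commit, so it is only a first approximation of "contributions".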
The GitHub API allows you to access the contributor list with stats directly btw:
https://developer.github.com/v3/repos/statistics/#contributors
So one would not need to actually look at the full git commit history.
Yep, but calling this for 18 million repos with a rate limit of 5,000 requests/hour is not the way to go. Even if we only consider active repos, it won't work...
GitHub Archive + Google BigQuery to filter the data is the best way I can think of to get exhaustive data.
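To put the rate-limit objection in numbers, here is a back-of-the-envelope calculation. It assumes a single API token and exactly one request per repository (both assumptions; paginated or retried responses would make it worse), and `crawl_days` is just a hypothetical helper name.

```python
def crawl_days(repos, requests_per_hour, requests_per_repo=1):
    """Days needed to query every repo once under a fixed hourly rate limit."""
    hours = repos * requests_per_repo / requests_per_hour
    return hours / 24


# 18 million repos at 5,000 requests/hour: 3,600 hours, i.e. 150 days.
days = crawl_days(18_000_000, 5_000)
```

So a single full pass would take about five months, and the data would be stale long before it finished.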
+1, I also think rankings won't really be correct unless repositories belonging to other organizations (which is 99.99% of them for some of us) are also taken into account.
+1 on this, most of my stars are in an org repo and they are not accounted for.