Add Xen Project to velocity statistics

Open klogg opened this issue 2 years ago • 8 comments

Xen is an LF project; see the official website at https://xenproject.org. Xen and its sub-projects are hosted at https://xenbits.xen.org, and the main development repository for the Xen Hypervisor is at https://xenbits.xen.org/git-http/xen.git

klogg, Dec 16 '21 20:12

Hi, this tracks projects listed on GitHub (using Google BigQuery), so Xen cannot be included unless I go to its source and calculate all the stats manually (commits, issues, PRs, authors, and so on). This requires a lot of extra, non-automated work. cc @caniszczyk - should I try to get all this data for Xen manually (possibly estimating anything that is not available) and include it too?

lukaszgryglicki, Dec 17 '21 07:12

Xen is here: https://github.com/xen-project/

caniszczyk, Dec 17 '21 13:12

OK, will add on Monday. Are there any other projects I should add? I'm asking because every BigQuery run costs money (we need to go through all GitHub data for a year, and it doesn't matter how many new orgs/repos I add).

lukaszgryglicki, Dec 17 '21 14:12

@caniszczyk the GitHub org contains a mirror that seems to be up to date, so it should be OK. @lukaszgryglicki maybe also the Unikraft project (initially developed as a Xen Project subproject)? https://github.com/unikraft

klogg, Dec 17 '21 15:12

Sure, no problem, please include whatever is needed by Monday - I'll do another round then.

lukaszgryglicki, Dec 17 '21 15:12

I can't get correct values for Xen using BigQuery. It looks like mirrors are recorded differently, and I'm investigating this. All I can get for Xen for the year 2021 is:

org,repo,activity,comments,prs,commits,issues,authors_alt2,authors_alt1,authors,pushes
xen-project,xen-project/xen,116,0,1,0,0,1,(null),(null),0
xen-project,xen-project/qemu-xen,4,0,0,0,0,1,(null),(null),0
xen-project,xen-project/mini-os,4,0,0,0,0,1,(null),(null),0

lukaszgryglicki, Dec 20 '21 06:12

Added Unikraft data. For Xen I don't have usable data from BigQuery, so I'll clone the main repo from the GitHub org and try to estimate the numbers using git commands. The 2 other repos don't have any activity during the last year, so the only one to analyze is: https://github.com/xen-project/xen

lukaszgryglicki, Dec 21 '21 11:12

OK, so I did some estimates: I cloned the xen repo and executed:

root@darkstar:~/dev/cncf/velocity/xen# git log --all --since "2021-01-01" --until "2021-12-11" --pretty=format:"%aE" | sort | uniq | wc -l
71
root@darkstar:~/dev/cncf/velocity/xen# git log --all --since "2021-07-01" --until "2021-12-11" --pretty=format:"%aE" | sort | uniq | wc -l
44
root@darkstar:~/dev/cncf/velocity/xen# git log --all --since "2021-01-01" --until "2021-12-11" --pretty=format:"%H" | sort | uniq | wc -l
1897
root@darkstar:~/dev/cncf/velocity/xen# git log --all --since "2021-07-01" --until "2021-12-11" --pretty=format:"%H" | sort | uniq | wc -l
777
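
As a rough sketch only, the four invocations above differ just in the start date and the format string (`%aE` for author email, `%H` for commit hash), so they could be folded into one helper. The function names `count_unique` and `window_stats` and the `./xen` clone path are hypothetical and not part of the velocity tooling.

```shell
#!/usr/bin/env bash

# Count distinct lines on stdin (same result as `sort | uniq | wc -l`).
count_unique() {
  sort -u | wc -l | tr -d '[:space:]'
}

# Hypothetical helper: print "<authors> <commits>" for one date window.
# $1 = path to the cloned repo, $2 = --since date, $3 = --until date.
window_stats() {
  local repo=$1 from=$2 to=$3
  local authors commits
  authors=$(git -C "$repo" log --all --since "$from" --until "$to" \
    --pretty=format:"%aE" | count_unique)
  commits=$(git -C "$repo" log --all --since "$from" --until "$to" \
    --pretty=format:"%H" | count_unique)
  echo "$authors $commits"
}

# Usage (assumes the Xen clone lives in ./xen):
#   window_stats ./xen 2021-01-01 2021-12-11
#   window_stats ./xen 2021-07-01 2021-12-11
```

Counting distinct author emails treats one person with two addresses as two authors, so the author figure is an estimate, as with the original commands.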

The git log commands above give me the number of authors and the number of commits from either 1/1/21 or 7/1/21 to 12/11/21 (to match the date when the other projects' data was generated). There are only 2 PRs on this repo (both closed), so I cannot guess how many issues/PRs should be set for Xen. I need the number of issues and PRs that were created between (4 numbers):

  • 1/1/21 and 12/11/21
  • 7/1/21 and 12/11/21

PRs and issues can be something different on non-GitHub sources (for example merge requests, bugs, etc.). For now, I did some guesswork and put some values in the reports, but I will correct them if I'm given the exact numbers (the charts need: number of authors, number of commits, number of issues, and number of PRs). We sort by number of authors, so the position of Xen is OK, and its commits value (x-axis) is also OK. Only the y-axis value (PRs + issues) is currently my guesswork. The datasheet also lists some more metric values (like comments, activity, pushes, etc.), but they aren't needed for the charts.
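
To make the chart inputs concrete, here is a minimal sketch of how one row per project could be turned into a chart point: x is the commit count, y is PRs plus issues, and rows are ordered by author count. The function name `chart_points`, the input column order, and the largest-first ordering are assumptions for illustration, not the actual velocity report code.

```shell
#!/usr/bin/env bash

# Input rows on stdin: project,authors,commits,prs,issues (no header).
# Output rows: project,authors,x,y where x = commits and y = prs + issues,
# sorted by authors, largest first (the sort direction is an assumption).
chart_points() {
  awk -F, 'BEGIN { OFS = "," } { print $1, $2, $3, $4 + $5 }' |
    sort -t, -k2,2nr
}

# Usage with placeholder rows (made-up values, not real project data):
#   printf 'proj-a,10,100,5,7\nproj-b,20,300,1,2\n' | chart_points
```

With this shape, a wrong PR or issue count only moves a project along the y-axis, which is why the guessed values do not affect Xen's ordering or its x position.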

I'm adding a blocked label because I'm missing data.

lukaszgryglicki, Dec 21 '21 11:12