Is the source code of the PR Insights GitHub App actually available?
https://github.com/apps/pr-insights
https://prinsights.searchcode.com/

I saw this app being used here and I assume it is also mostly powered by scc, but clicking "source" on the second link just points back to this repo, and searching for "PR Insights", "X/size", "github app" or "issues.opened" yields no results related to this app.
Ah, no it's not.
I'm... not sure if I should yet, actually. Mostly because I haven't checked it over properly, and also to make it harder for someone to just fork it and charge for access via GitHub.
I see. That's totally cool, but the website is a bit misleading then. The app seems very cool and like an improvement over the one I'm currently using (https://github.com/apps/pull-request-size), but without being able to audit it, the wording about "AI-powered: adapts to your repo’s trends" is a bit concerning.
Well "AI" as an if statement ;)
It actually uses the last 100 PRs to do this (from memory), so it is technically learning what you are doing. The AI portion was just me being flippant at the time I created the page.
As said, I am not opposed to opening it up, but it was not being advertised yet beyond my own personal use. I don't know if I could open it up for everyone, as there is a material cost to me in running it, which is partly why I have done neither.
Open to suggestions here BTW. My intent is not to hide anything, but I also don't want someone to just turn it into a $5 application and not help me with the actual costs of running things. I currently run the badges myself, but I'd love to have it self-supporting.
I don't know if I could open it up for everyone as there is a material cost to me running it
I can already install the app directly; is that not intended? This is on a private repo that I just used for testing.
Anyway, I understand where you're coming from, and this isn't meant to be a subtle push for you to open-source it if you don't mean to. I am not familiar enough with the GitHub Apps ecosystem to judge whether your worry is justified either. Perhaps it's best to keep things as-is and just update the website to say the source is not available.
(But I think I personally won't use your app then; even ignoring all the "AI" stuff, it feels weird to use an app whose algorithm is not documented or auditable, since I wouldn't be able to easily understand what M/complexity means compared to L/complexity.)
No, I know you can; I just was not advertising it openly.
As mentioned, I am looking very seriously at doing that, once I clean it up and work through some issues.
For now, though, here is the core of the algorithm, with source.
It reads all of the previous PRs for that specific repository, calculates the standard deviation while ignoring outliers, then uses that to infer the size. The calculation for the PR itself follows below:
func (ser *Service) pullRequestAnalyse(pra database.Pr) (ChangeSize, ComplexitySize, error) {
	qr := database.New(ser.DbRead)

	var changeSize = AverageChange
	var complexitySize = AverageComplexity

	// Pull the recent PRs for this repository to build the baseline.
	recent, err := qr.PrRecents(context.Background(), database.PrRecentsParams{
		Reponame:  pra.Reponame,
		Repoowner: pra.Repoowner,
	})
	if err != nil {
		return changeSize, complexitySize, err
	}

	// Mean and standard deviation of changed lines across recent PRs, ignoring outliers.
	meanLines, stdLines := common.StandardDeviationNoOutliers(lo.Map(recent, func(item database.PrRecentsRow, index int) float64 {
		return float64(item.Pr.Delta)
	}), 2.0)

	// Size bands sit at half and one standard deviation either side of the mean.
	threshold := 0.5
	xs := meanLines - threshold*stdLines*2
	s := meanLines - threshold*stdLines
	l := meanLines + threshold*stdLines
	xl := meanLines + threshold*stdLines*2

	delta := float64(pra.Delta)

	// NB order matters here
	switch {
	case delta < xs:
		changeSize = ExtraSmallChange
	case delta < s:
		changeSize = SmallChange
	case delta > xl:
		changeSize = ExtraLargeChange
	case delta > l:
		changeSize = LargeChange
	default:
		changeSize = AverageChange
	}

	// Same approach again, this time over the complexity of the recent PRs.
	meanComplexity, stdComplexity := common.StandardDeviationNoOutliers(lo.Map(recent, func(item database.PrRecentsRow, index int) float64 {
		return float64(item.Pr.Complexity)
	}), threshold)

	cxs := meanComplexity - threshold*stdComplexity*2
	cs := meanComplexity - threshold*stdComplexity
	cl := meanComplexity + threshold*stdComplexity
	cxl := meanComplexity + threshold*stdComplexity*2

	cdelta := float64(pra.Complexity)

	// NB order matters here
	switch {
	case cdelta < cxs:
		complexitySize = VeryLowComplexity
	case cdelta < cs:
		complexitySize = LowComplexity
	case cdelta > cxl:
		complexitySize = ExtraLargeComplexity
	case cdelta > cl:
		complexitySize = LargeComplexity
	default:
		complexitySize = AverageComplexity
	}

	return changeSize, complexitySize, nil
}
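common.StandardDeviationNoOutliers is not included above; roughly speaking it takes the mean and standard deviation, drops anything too far from the mean, then recomputes. A rough sketch of that sort of helper (not the exact code, which may trim outliers a little differently) looks like this:

// Rough sketch of a StandardDeviationNoOutliers style helper, not the exact
// code used by the app: take the mean and standard deviation, throw away
// anything more than cutoff standard deviations from the mean, then
// recompute on what is left.
package common

import "math"

func meanStd(values []float64) (float64, float64) {
	if len(values) == 0 {
		return 0, 0
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	mean := sum / float64(len(values))

	var variance float64
	for _, v := range values {
		variance += (v - mean) * (v - mean)
	}
	return mean, math.Sqrt(variance / float64(len(values)))
}

func StandardDeviationNoOutliers(values []float64, cutoff float64) (float64, float64) {
	mean, std := meanStd(values)

	kept := make([]float64, 0, len(values))
	for _, v := range values {
		if math.Abs(v-mean) <= cutoff*std {
			kept = append(kept, v)
		}
	}
	return meanStd(kept)
}

To make the banding concrete: if the recent PRs for a repository average 200 changed lines with a standard deviation of 100, then with threshold = 0.5 anything under 100 lines is XS, between 100 and 150 is S, between 250 and 300 is L, over 300 is XL, and everything else lands in the average bucket.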
This deals with the PR callback from GitHub itself, and determines the additions, deletions, and complexity based on how scc does it, with scc just included as a library.
func (ser *Service) ProcessPullRequest(ctx context.Context, client *github.Client, pr PullRequestEventStruct) (PullRequestAnalysis, error) {
	// Ask GitHub for the files touched by this PR.
	files, _, err := client.PullRequests.ListFiles(ctx, pr.RepoOwner, pr.RepoName, pr.Number, &github.ListOptions{
		Page:    0,
		PerPage: 100,
	})
	if err != nil {
		return PullRequestAnalysis{}, err
	}

	pra := PullRequestAnalysis{}
	for _, file := range files {
		if file.Additions != nil {
			pra.Additions += *file.Additions
		}
		if file.Deletions != nil {
			pra.Deletions += *file.Deletions
		}
		if file.Patch != nil {
			// Pull the added lines out of the patch and run them through
			// scc's counting to estimate the complexity of the change.
			add, _ := parsePatch(*file.Patch)
			c := countCommitFile(file, add)
			pra.Complexity += int(c.Complexity)
		}
	}

	return pra, nil
}
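parsePatch and countCommitFile are also not included above. parsePatch splits the unified diff GitHub returns for each file into its added and removed lines, and countCommitFile then runs the added lines through scc (used as a library) to get the Complexity figure. A rough sketch of the patch-parsing side (again not the exact code, and the package name is just for the sketch) looks like this:

// Rough sketch of a parsePatch style helper, not the exact code: walk the
// unified diff that GitHub returns for a file and split it into the lines
// that were added and the lines that were removed.
package service

import "strings"

func parsePatch(patch string) (added []byte, removed []byte) {
	for _, line := range strings.Split(patch, "\n") {
		switch {
		case strings.HasPrefix(line, "+++"), strings.HasPrefix(line, "---"), strings.HasPrefix(line, "@@"):
			// diff headers and hunk markers are not code
		case strings.HasPrefix(line, "+"):
			added = append(added, line[1:]...)
			added = append(added, '\n')
		case strings.HasPrefix(line, "-"):
			removed = append(removed, line[1:]...)
			removed = append(removed, '\n')
		}
	}
	return added, removed
}

The Additions and Deletions figures come straight from GitHub's file list, so scc is only involved in the complexity side of things.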
I probably will open it up similar to https://github.com/apps/pull-request-size and just YOLO it... if someone takes it... well, that's on them. Although I may just make this part AGPL to avoid any issues, which might be the solution. It won't be today though, because I have other things to do, but check back this time next week and it's likely to be here for you to look at, albeit with an AGPL license for that portion.
OK, the 3.6.0 release is done and this code has been cleaned up. I'll port it across sometime and commit it.