
Is the source code of the PR Insights GitHub App actually available?

Open Glinte opened this issue 2 months ago • 6 comments

https://github.com/apps/pr-insights
https://prinsights.searchcode.com/

I saw this app being used here and I assume it is also mostly powered by scc, but clicking on "source" from the second link links back to this repo, and searching "PR Insights", "X/size", "github app" or "issues.opened" yields no result related to this app.

Glinte avatar Oct 23 '25 13:10 Glinte

Ah, no, it's not.

I'm... not sure yet whether I should open it, actually. Mostly because I haven't checked it properly, and also to make it hard for someone to just fork it and charge for access via GitHub.

boyter avatar Oct 24 '25 02:10 boyter

I see. That's totally cool but the website is a bit misleading then. The app seems very cool and seems like an improvement over the current app I'm using (https://github.com/apps/pull-request-size), but without being able to audit the app it is a bit concerning with your wording about "AI-powered: adapts to your repo’s trends".

Glinte avatar Oct 24 '25 02:10 Glinte

Well "AI" as an if statement ;)

It actually uses the last 100 PRs in order to do this (from memory). So it is technically learning what you are doing. The AI portion was just me being flippant at the time I created the page.

As said, I am not opposed to opening it up, but it was not being advertised as yet beyond my own personal use. I don't know if I could open it up for everyone as there is a material cost to me running it which is partly why I have done neither.

Open to suggestions here BTW. My intent is not to hide anything, but I also don't want someone to just turn it into a $5 application and not help me with the actual costs of running things. I currently run the badges myself, but I'd love to have it self-supporting.

boyter avatar Oct 24 '25 02:10 boyter

> I don't know if I could open it up for everyone as there is a material cost to me running it

I can already directly install the app, is that not intended? This is on a private repo that I just used for testing.

Anyways, I understand where you're coming from and this isn't meant to be a subtle push for you to open source if you don't mean to. I am not familiar with the github apps ecosystem to judge whether your worry is justified or not either. Perhaps it's best to keep things as-is and just update the website to say source is not available.

(But I think I personally won't use your app then, because even ignoring all the "AI" stuff it feels weird to use an app where the algorithm is not documented or auditable, because I wouldn't be able to easily understand what M/complexity means compared to L/complexity)

Glinte avatar Oct 24 '25 02:10 Glinte

No I know you can, I just was not advertising it openly.

As mentioned I am very strongly looking at it, once I clean it up and once I come up with some issues.

For now though here is the core of the algorithm with source,

It reads all of the previous PRs for that specific repository, calculates the standard deviation ignoring outliers, then uses that to infer the size. The calculation of the PR itself follows the below,

func (ser *Service) pullRequestAnalyse(pra database.Pr) (ChangeSize, ComplexitySize, error) {
	qr := database.New(ser.DbRead)

	var changeSize = AverageChange
	var complexitySize = AverageComplexity

	recent, err := qr.PrRecents(context.Background(), database.PrRecentsParams{
		Reponame:  pra.Reponame,
		Repoowner: pra.Repoowner,
	})
	if err != nil {
		return changeSize, complexitySize, err
	}

	meanLines, stdLines := common.StandardDeviationNoOutliers(lo.Map(recent, func(item database.PrRecentsRow, index int) float64 {
		return float64(item.Pr.Delta)
	}), 2.0)

	threshold := 0.5

	xs := meanLines - threshold*stdLines*2
	s := meanLines - threshold*stdLines
	l := meanLines + threshold*stdLines
	xl := meanLines + threshold*stdLines*2

	delta := float64(pra.Delta)

	// NB order matters here
	switch {
	case delta < xs:
		changeSize = ExtraSmallChange
	case delta < s:
		changeSize = SmallChange
	case delta > xl:
		changeSize = ExtraLargeChange
	case delta > l:
		changeSize = LargeChange
	default:
		changeSize = AverageChange
	}

	meanComplexity, stdComplexity := common.StandardDeviationNoOutliers(lo.Map(recent, func(item database.PrRecentsRow, index int) float64 {
		return float64(item.Pr.Complexity)
	}), threshold)

	cxs := meanComplexity - threshold*stdComplexity*2
	cs := meanComplexity - threshold*stdComplexity
	cl := meanComplexity + threshold*stdComplexity
	cxl := meanComplexity + threshold*stdComplexity*2

	cdelta := float64(pra.Complexity)

	// NB order matters here
	switch {
	case cdelta < cxs:
		complexitySize = VeryLowComplexity
	case cdelta < cs:
		complexitySize = LowComplexity
	case cdelta > cxl:
		complexitySize = ExtraLargeComplexity
	case cdelta > cl:
		complexitySize = LargeComplexity
	default:
		complexitySize = AverageComplexity
	}

	return changeSize, complexitySize, nil
}
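
As a rough illustration of how the cut points above behave, here is a minimal self-contained sketch with made-up numbers. `common.StandardDeviationNoOutliers` is not shown in this thread, so `trimmedMeanStd` is only a guess at what it does (drop values more than `k` standard deviations from the mean, then recompute), and the `XS`/`S`/`M`/`L`/`XL` strings are assumed stand-ins for the `ExtraSmallChange` ... `ExtraLargeChange` constants.

```go
package main

import (
	"fmt"
	"math"
)

// meanStd returns the mean and population standard deviation of xs.
func meanStd(xs []float64) (mean, std float64) {
	if len(xs) == 0 {
		return 0, 0
	}
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		std += (x - mean) * (x - mean)
	}
	return mean, math.Sqrt(std / float64(len(xs)))
}

// trimmedMeanStd is a hypothetical stand-in for common.StandardDeviationNoOutliers:
// it drops values more than k standard deviations from the mean, then
// recomputes the mean and standard deviation on what is left.
func trimmedMeanStd(xs []float64, k float64) (float64, float64) {
	mean, std := meanStd(xs)
	kept := make([]float64, 0, len(xs))
	for _, x := range xs {
		if math.Abs(x-mean) <= k*std {
			kept = append(kept, x)
		}
	}
	return meanStd(kept)
}

// bucket mirrors the switch in the posted code: with threshold = 0.5 the
// cut points sit at mean ± 0.5σ and mean ± 1σ.
func bucket(v, mean, std float64) string {
	const threshold = 0.5
	switch {
	case v < mean-threshold*std*2:
		return "XS"
	case v < mean-threshold*std:
		return "S"
	case v > mean+threshold*std*2:
		return "XL"
	case v > mean+threshold*std:
		return "L"
	default:
		return "M"
	}
}

func main() {
	// Made-up line deltas for recent PRs; 5000 is an obvious outlier.
	history := []float64{80, 90, 100, 110, 120, 5000}
	mean, std := trimmedMeanStd(history, 2.0)
	for _, delta := range []float64{50, 90, 100, 110, 200} {
		fmt.Printf("delta=%v -> %s\n", delta, bucket(delta, mean, std))
	}
}
```

With this history the 5000-line PR is discarded, the remaining mean is 100 and the standard deviation is about 14.1, so anything within roughly 7 lines of the mean lands in the middle bucket and anything more than about 14 lines away lands in an extreme bucket. A repository with very uniform PR sizes therefore gets narrow bands.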

This deals with the PR call from GitHub itself, and determines the additions, deletions, and complexity the same way scc does, with scc simply included as a library.


func (ser *Service) ProcessPullRequest(ctx context.Context, client *github.Client, pr PullRequestEventStruct) (PullRequestAnalysis, error) {
	files, _, err := client.PullRequests.ListFiles(ctx, pr.RepoOwner, pr.RepoName, pr.Number, &github.ListOptions{
		Page:    0,
		PerPage: 100,
	})
	if err != nil {
		return PullRequestAnalysis{}, err
	}

	pra := PullRequestAnalysis{}
	for _, file := range files {
		if file.Additions != nil {
			pra.Additions += *file.Additions
		}
		if file.Deletions != nil {
			pra.Deletions += *file.Deletions
		}

		if file.Patch != nil {
			add, _ := parsePatch(*file.Patch)

			c := countCommitFile(file, add)
			pra.Complexity += int(c.Complexity)
		}
	}

	return pra, nil
}

I probably will open it up similar to https://github.com/apps/pull-request-size and just YOLO it... if someone takes it... well, that's on them. Although I may just make this part AGPL to avoid any issues, which might be the solution. It won't be today though because I have other things to do, but check back this time next week and it's likely to be here for you to look at, albeit under an AGPL license for that portion.

boyter avatar Oct 24 '25 03:10 boyter

OK, 3.6.0 release done and this code has been cleaned up. I'll port it across sometime and commit.

boyter avatar Oct 31 '25 05:10 boyter