
Forward integration results in strange language metrics

Open alaendle opened this issue 4 years ago • 4 comments

When changes are merged (into a PR), they get attributed to the author of the merge commit. For example, I suspect that https://github.com/Azure/iotedge/pull/6114 credits me with a lot of Rust lines, even though I have never touched a single line in a Rust file.

I know I kept this very short; please let me know if it isn't self-explanatory or if I can contribute further information.

alaendle avatar Mar 01 '22 08:03 alaendle

Hi, @alaendle

The core of this gist fetches the user's commits from the GitHub API and counts them. It has some filters to ignore unsuitable commits, but they may not be enough.

https://github.com/inokawa/lang-box/blob/33e344be28186a7a4189a282c89fa56047e7a102/src/index.js#L56-L57

https://github.com/inokawa/lang-box/blob/33e344be28186a7a4189a282c89fa56047e7a102/src/index.js#L79-L80

If I could determine where the commits came from, I might be able to fix it. However, some problems are probably impossible to fix because of limitations of the GitHub API.

inokawa avatar Mar 01 '22 13:03 inokawa
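
For illustration, here is a minimal sketch of why a push can credit someone with other people's commits. It is not the actual lang-box code. It assumes event objects shaped like the `PushEvent` payload in the GitHub event types documentation, with `actor.login` and `payload.commits[].author`. The helper `collectPushedCommits` and all sample values are hypothetical.

```javascript
// Made-up sample of event objects; only the fields used below are included.
const sampleEvents = [
  {
    type: "PushEvent",
    actor: { login: "pusher" },
    payload: {
      commits: [
        { sha: "a1", author: { name: "Pusher", email: "pusher@example.com" } },
        // Brought in by merging main: pushed by "pusher", written by someone else.
        { sha: "b2", author: { name: "Someone Else", email: "else@example.com" } },
      ],
    },
  },
  { type: "WatchEvent", actor: { login: "pusher" }, payload: {} },
];

// Hypothetical helper: flatten the commits of all PushEvents and keep
// both the pushing account and the git author of each commit.
function collectPushedCommits(events) {
  return events
    .filter((e) => e.type === "PushEvent")
    .flatMap((e) =>
      e.payload.commits.map((c) => ({
        sha: c.sha,
        pushedBy: e.actor.login,
        authorName: c.author.name,
      }))
    );
}

const commits = collectPushedCommits(sampleEvents);
// Both commits are listed under the pusher, although only one was authored by them.
console.log(commits);
```

Counting every commit in `commits` for the account in `pushedBy` would include `b2`, which is the misattribution described in this issue.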

@inokawa Many thanks for your reply. I now understand how deep the rabbit hole goes. The basic problem from my point of view is that, yes, technically I integrate/push all the commits from MAIN during a forward integration (FI), i.e. the C's on MAIN. However, this is usually not my code, and I don't want it counted in my stats.

(cited from https://docs.microsoft.com/en-us/azure/devops/repos/tfvc/branching-strategies-with-tfvc?view=azure-devops)

However, I have to admit that it isn't easy to distinguish the "right" commits from the "wrong" ones, simply because there is no direct relationship between the GitHub user and the git user. So no solution would be perfect. The only thing that comes to my mind is that some (maybe optional) filtering of the commits by git user name might solve this kind of misinterpretation.

alaendle avatar Mar 01 '22 16:03 alaendle

The only thing that comes to my mind is that some (maybe optional) filtering of the commits by git user name might solve this kind of misinterpretation.

Looks nice.

It may be possible to put a filter like this

.filter((c) => c.author.name === 'foo')

.filter((c) => c.author.email === 'bar')

into

https://github.com/inokawa/lang-box/blob/33e344be28186a7a4189a282c89fa56047e7a102/src/index.js#L56-L57

https://docs.github.com/en/developers/webhooks-and-events/events/github-event-types#pushevent

I don't have time to work on this at the moment, but feel free to fix your own gist; PRs are also welcome.

inokawa avatar Mar 02 '22 10:03 inokawa
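
As a rough sketch, the filter suggested above could be made optional. It assumes commit objects that carry `author.name` and `author.email`, as in the `PushEvent` payload linked above. The helper `makeAuthorFilter`, its option names, and the sample data are hypothetical and not part of lang-box.

```javascript
// Hypothetical helper: build a predicate from optional settings.
// An option that is left undefined does not restrict the result.
function makeAuthorFilter({ name, email } = {}) {
  return (commit) =>
    (name === undefined || commit.author.name === name) &&
    (email === undefined || commit.author.email === email);
}

// Made-up commits, as they might arrive after merging main into a branch.
const pushedCommits = [
  { sha: "a1", author: { name: "Pusher", email: "pusher@example.com" } },
  { sha: "b2", author: { name: "Someone Else", email: "else@example.com" } },
];

// Keep only the commits whose git author name matches.
const mine = pushedCommits.filter(makeAuthorFilter({ name: "Pusher" }));
console.log(mine.map((c) => c.sha));

// Without any options, every commit passes, so the filter stays opt-in.
const all = pushedCommits.filter(makeAuthorFilter());
console.log(all.length);
```

Because the git name and email are free-form and set per machine, such a filter can also drop one's own commits made under a different identity, so it is a heuristic rather than a general fix.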

Hi @inokawa - many thanks for your kind support ❤️. I've just fixed my gist the way you suggested:

https://github.com/alaendle/lang-box/commit/c10b93421712231a79348f72edbf5d9ea220e5e6#diff-bfe9874d239014961b1ae4e89875a6155667db834a410aaaa2ebe3cf89820556

My rare last name is finally paying off 😉

However, I see no truly generic solution, so I haven't created a PR for now.

By the way, the famous https://github.com/bokub/github-stats-box suffers from similar problems with regard to its commit count.

alaendle avatar Mar 04 '22 06:03 alaendle