Different authors with the same username but different emails do not trigger any warning

Open sikai00 opened this issue 2 years ago • 5 comments

What feature(s) would you like to see in RepoSense?

In a similar vein to #436, I believe it will be useful to trigger a warning when there are different authors with the same Git author username, but with different emails. We use email to differentiate whether they are in fact the same user or different authors.

Is the feature request related to a problem?

It is quite possible for users to coincidentally have the same Git author username configured on their computers (for example, because they share the same given name). As a result, we may accidentally attribute contributions to the wrong author. A warning similar to the one implemented in #436 could help maintainers of RepoSense dashboards identify potentially wrong attributions caused by users having the same Git author username.

If possible, describe the solution

If applicable, describe alternatives you've considered

Additional context

A potential issue is that emails may not be specified at all. In that case, a standard email is given in the form of [email protected]. Unfortunately, for these cases I don't believe there is a suitable way to tell the authors apart.

Another potential issue is that the warning may be a false positive, since the same user may simply have different emails configured (for example, on two different devices).
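
To make the request more concrete, here is a minimal sketch of the kind of check I have in mind. It is not based on the RepoSense code: the class `DuplicateNameCheck`, the method `findWarnings`, and the `{name, email}` array layout are all hypothetical, and comparing emails case-insensitively is my own assumption. Commits without an email are skipped, since they cannot be used to tell authors apart.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of RepoSense: flags Git author names that
// appear with more than one distinct email address.
final class DuplicateNameCheck {

    // Each identity is a {name, email} pair taken from a commit.
    // A null or empty email is skipped because it cannot distinguish authors.
    static List<String> findWarnings(List<String[]> identities) {
        Map<String, Set<String>> emailsByName = new LinkedHashMap<>();
        for (String[] identity : identities) {
            String name = identity[0];
            String email = identity[1];
            if (email == null || email.isEmpty()) {
                continue;
            }
            // Assumption: emails are compared case-insensitively.
            emailsByName.computeIfAbsent(name, k -> new LinkedHashSet<>())
                    .add(email.toLowerCase(Locale.ROOT));
        }

        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : emailsByName.entrySet()) {
            if (entry.getValue().size() > 1) {
                warnings.add("Author name \"" + entry.getKey()
                        + "\" is used with multiple emails: " + entry.getValue());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<String[]> identities = List.of(
                new String[] {"david", "david.a@example.com"},
                new String[] {"david", "david.b@example.com"},
                new String[] {"David Ong", "david.ong@example.com"});
        findWarnings(identities).forEach(System.out::println);
    }
}
```

With the sample data above, only the name "david" is reported, because it is the only name seen with two different emails. As noted, such a warning can still be a false positive when one person commits from two machines.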

sikai00 avatar Feb 06 '23 16:02 sikai00

Do you have a sample test run on a repo where this might be a problem? It's been a while since I looked at the line attribution part of the code, but I believe it checks both the email (if available) and the author id of the commit before attributing it, though I could be wrong.

chan-j-d avatar Feb 07 '23 07:02 chan-j-d

I believe I first thought of this issue when there was a misconfiguration in the NUS CS3281 RepoSense dashboard.

image

There are two students named David in this class. In this example, work was attributed to the wrong David. One of the students uses "david" as his Git author username. The other uses "David Ong".

image

While this only happened due to a misconfiguration of author-config.csv, it is still an incident where attribution was wrong because of similar Git author names. Since common Git author usernames can easily be shared between users, I believe this can potentially be an issue.

sikai00 avatar Feb 10 '23 11:02 sikai00

I believe the getAuthor being used should be the one in AuthorConfiguration.java.

image

Here, it is actually being matched sequentially by priority: author name first, then author email, rather than checking both author name and author email together.
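
To illustrate why that order matters, here is a simplified sketch of name-first matching. This is not the actual code from AuthorConfiguration.java: the class `NameFirstLookup`, its two maps, the `register` method, and the example emails are all hypothetical and exist only to show the effect.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical sketch of name-first matching; this is not the
// actual RepoSense implementation.
final class NameFirstLookup {
    static final String UNKNOWN = "unknown";

    private final Map<String, String> authorByName = new HashMap<>();
    private final Map<String, String> authorByEmail = new HashMap<>();

    // Records the Git author name and email configured for an author.
    void register(String author, String gitName, String email) {
        authorByName.put(gitName, author);
        authorByEmail.put(email, author);
    }

    // The name is checked first; the email is consulted only when the name is unknown.
    String getAuthor(String gitName, String email) {
        if (authorByName.containsKey(gitName)) {
            return authorByName.get(gitName);
        }
        return authorByEmail.getOrDefault(email, UNKNOWN);
    }

    public static void main(String[] args) {
        NameFirstLookup lookup = new NameFirstLookup();
        lookup.register("studentA", "david", "a@example.com");
        lookup.register("studentB", "David Ong", "b@example.com");
        // studentB commits with the Git name "david" but his own email.
        System.out.println(lookup.getAuthor("david", "b@example.com")); // prints "studentA"
    }
}
```

In this sketch the commit goes to studentA even though the email belongs to studentB, because the name match returns before the email is ever looked at.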

sikai00 avatar Feb 10 '23 11:02 sikai00

I think just changing the order might be enough since, if I'm not wrong, the email is unique as long as it's not the default GitHub one they use as a placeholder.
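
Roughly what I mean, as a hedged sketch rather than a patch: the class `EmailFirstLookup`, its maps, and the `isPlaceholderEmail` predicate are hypothetical names, and how a placeholder email is recognised is left to the caller since I have not checked how RepoSense handles it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of email-first matching; not the actual RepoSense code.
final class EmailFirstLookup {
    static final String UNKNOWN = "unknown";

    private final Map<String, String> authorByName = new HashMap<>();
    private final Map<String, String> authorByEmail = new HashMap<>();
    // Assumption: the caller supplies the rule for spotting placeholder emails.
    private final Predicate<String> isPlaceholderEmail;

    EmailFirstLookup(Predicate<String> isPlaceholderEmail) {
        this.isPlaceholderEmail = isPlaceholderEmail;
    }

    // Records the Git author name and email configured for an author.
    void register(String author, String gitName, String email) {
        authorByName.put(gitName, author);
        authorByEmail.put(email, author);
    }

    // The email is checked first when it is usable; the name is only a fallback.
    String getAuthor(String gitName, String email) {
        boolean usableEmail = email != null && !email.isEmpty()
                && !isPlaceholderEmail.test(email);
        if (usableEmail && authorByEmail.containsKey(email)) {
            return authorByEmail.get(email);
        }
        return authorByName.getOrDefault(gitName, UNKNOWN);
    }

    public static void main(String[] args) {
        // "placeholder@example.com" is a made-up stand-in for a default email.
        EmailFirstLookup lookup =
                new EmailFirstLookup(email -> email.equals("placeholder@example.com"));
        lookup.register("studentA", "david", "a@example.com");
        lookup.register("studentB", "David Ong", "b@example.com");
        System.out.println(lookup.getAuthor("david", "b@example.com")); // prints "studentB"
    }
}
```

The name-based fallback still applies when the email is missing or a placeholder, so those commits can be misattributed in the same way as before.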

chan-j-d avatar Feb 10 '23 12:02 chan-j-d

Hi, may I check if this issue has been resolved?

logical-1985516 avatar Jul 15 '24 10:07 logical-1985516