RepoSense
Different authors with the same username but different emails do not trigger any warning
What feature(s) would you like to see in RepoSense?
In a similar vein to #436, I believe it will be useful to trigger a warning when there are different authors with the same Git author username, but with different emails. We use email to differentiate whether they are in fact the same user or different authors.
Is the feature request related to a problem?
It is quite possible for users to coincidentally have the same Git author username configured on their computers (for example, because they share the same given name). As a result, we may accidentally attribute contributions to the wrong author. A warning similar to the one implemented in #436 could help maintainers of RepoSense dashboards identify wrong attributions caused by users having the same Git author username.
If possible, describe the solution
If applicable, describe alternatives you've considered
Additional context
A potential issue is that emails may not be specified at all. In that case, a standard email is given in the form of [email protected]. Unfortunately, in these cases I don't believe there is a suitable way to tell the authors apart.
Another potential issue is that the warning may be a false positive, since the same user may happen to have different email configurations (for example, on two different devices).
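The warning proposed above could be sketched roughly as follows. This is only an illustration, not RepoSense code: `DuplicateAuthorNameCheck`, `CommitAuthor`, and `findWarnings` are hypothetical names, and the sketch assumes the Git author name and email of each commit have already been extracted. It groups commits by author name and reports any name seen with more than one distinct email. It does not try to handle the two caveats above (missing emails, and one person using several emails).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of RepoSense.
final class DuplicateAuthorNameCheck {

    /** Hypothetical stand-in for the author fields of a single commit. */
    static final class CommitAuthor {
        final String name;
        final String email;

        CommitAuthor(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    /** Returns one warning per Git author name that appears with more than one distinct email. */
    static List<String> findWarnings(List<CommitAuthor> commitAuthors) {
        // Collect the distinct emails seen for each author name, in encounter order.
        Map<String, Set<String>> emailsByName = new LinkedHashMap<>();
        for (CommitAuthor author : commitAuthors) {
            emailsByName
                    .computeIfAbsent(author.name, key -> new LinkedHashSet<>())
                    .add(author.email);
        }

        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : emailsByName.entrySet()) {
            if (entry.getValue().size() > 1) {
                warnings.add("Git author name \"" + entry.getKey()
                        + "\" is used with multiple emails: " + entry.getValue());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Made-up sample data using the reserved example.com domain.
        List<CommitAuthor> authors = Arrays.asList(
                new CommitAuthor("david", "first@example.com"),
                new CommitAuthor("david", "second@example.com"),
                new CommitAuthor("David Ong", "third@example.com"));
        for (String warning : findWarnings(authors)) {
            System.out.println(warning);
        }
    }
}
```

Since the same person may legitimately commit with several emails, the output is best treated as a hint for the dashboard maintainer to review rather than as an error.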
Do you have a sample test run on a repo where this might be a problem? It's been a while since I looked at the line attribution part of the code, but I believe it checks both the email (if available) and the author id of the commit before attributing it, though I could be wrong.
I believe I first thought of this issue when there was a misconfiguration in the NUS CS3281 RepoSense dashboard.
There are two students named David in this class. In this example, work was attributed to the wrong David. One of the students uses "david" as his Git author username. The other uses "David Ong".
While this actually only happened due to a misconfiguration of author-config.csv, it is an example of an incident where attribution went wrong because of similar Git author names. Since common Git author usernames can be shared by several people, I believe this can potentially be an issue.
I believe the getAuthor being used should be the one in AuthorConfiguration.java.
[image]
Here, it is actually being matched sequentially by priority: author name first, then author email, rather than checking both author name and author email together.
I think just changing the order might be enough since, if I'm not wrong, the email is unique as long as it's not the default GitHub one they use as a placeholder.
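To make the suggested reordering concrete, here is a minimal sketch that tries the email first and falls back to the author name only when the email is unknown. It is not the actual `getAuthor` in `AuthorConfiguration.java`: the class `EmailFirstAuthorMatcher`, its two maps, and the `register` and `match` methods are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical matcher, not RepoSense's AuthorConfiguration.
final class EmailFirstAuthorMatcher {
    private final Map<String, String> authorIdByEmail = new HashMap<>();
    private final Map<String, String> authorIdByName = new HashMap<>();

    /** Registers a configured author under a Git author name and, if given, an email. */
    void register(String authorId, String gitName, String email) {
        // If two authors share a Git name, the later registration wins here;
        // this is exactly the ambiguity that email-first matching tries to avoid.
        authorIdByName.put(gitName, authorId);
        if (email != null && !email.isEmpty()) {
            authorIdByEmail.put(email, authorId);
        }
    }

    /** Matches a commit by email first, then by author name as a fallback. */
    Optional<String> match(String commitName, String commitEmail) {
        String byEmail = authorIdByEmail.get(commitEmail);
        if (byEmail != null) {
            return Optional.of(byEmail);
        }
        return Optional.ofNullable(authorIdByName.get(commitName));
    }

    public static void main(String[] args) {
        EmailFirstAuthorMatcher matcher = new EmailFirstAuthorMatcher();
        // Made-up author ids and emails using the reserved example.com domain.
        matcher.register("davidA", "david", "first@example.com");
        matcher.register("davidB", "david", "second@example.com");
        System.out.println(matcher.match("david", "first@example.com"));
        System.out.println(matcher.match("david", "second@example.com"));
    }
}
```

With this ordering, two authors who share the name "david" are still told apart whenever their commits carry distinct, registered emails. Commits with an unregistered or placeholder email fall back to name matching, so that case stays ambiguous.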
Hi, may I check if this issue has been resolved?