
Idea for determining weight of contribution

Open erezsh opened this issue 3 years ago • 12 comments

I believe that there is no metric that faithfully correlates to contribution. Trying to measure the number or frequency of commits, lines-of-code, and so forth, will fail, as they are rather arbitrary. And worse, just by measuring them, they will become even further invalidated, as contributors try to game the system (even if with the best intentions).

I think the only way to measure contribution with high reliability is using human evaluation. Of course, direct evaluation lends itself to politics and other such messy things. But it's possible to do so indirectly, in a way that is fairly stable.

The idea is this:

For every issue, several maintainers will estimate how much work is required (i.e. how much time it should take them). The average evaluation will determine the cost of the issue. By submitting a PR that answers the issue and gets accepted, the contributor is awarded "X work done". Funds are then distributed based on amount of work.
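
A minimal sketch of how that could work in practice (the issue names, estimates, and budget below are made up for illustration and are not part of LibreSelery):

```python
from statistics import mean

# Hypothetical data: several maintainers estimate the cost of each issue in hours.
estimates = {
    "issue-101": [4, 6, 5],      # three maintainers' estimates
    "issue-102": [16, 20, 12],
}

# The average estimate becomes the "cost" of the issue.
issue_cost = {issue: mean(hours) for issue, hours in estimates.items()}

# Contributors are credited with the cost of every issue their accepted PR closes.
work_done = {"alice": issue_cost["issue-101"], "bob": issue_cost["issue-102"]}

# Funds are split in proportion to the work done.
budget = 1000.0
total_work = sum(work_done.values())
payout = {user: budget * work / total_work for user, work in work_done.items()}
print(payout)  # roughly {'alice': 238.1, 'bob': 761.9}
```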

Advantages:

  • It's possible to know how much you'll be paid for your contributions, if you stick to issues that already have a stable work-estimate.

  • Hard to game (I think?)

  • By aggregating the subjective experience, it takes into account both time and difficulty

Caveats:

  • Each PR will require an issue

  • Some issues might be harder than they seem. That can be fixed with an automatically growing bonus based on how long the issue remains open (see the sketch after this list).

  • Requires the maintainers to spend more time reviewing issues. Perhaps that could also be encouraged using a monetary reward. Perhaps gamified by rewarding those who were closest to the final estimate.
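
As a rough illustration of the time-based bonus mentioned in the second caveat, here is a sketch where the growth rate and cap are made-up parameters:

```python
def issue_reward(base_cost_hours, days_open, bonus_per_week=0.05, max_bonus=0.5):
    """Grow the reward the longer an issue stays open, up to a cap.

    base_cost_hours: averaged maintainer estimate for the issue
    bonus_per_week:  hypothetical 5% bonus per week the issue remains open
    max_bonus:       hypothetical cap so stale issues don't dominate the budget
    """
    bonus = min(max_bonus, bonus_per_week * days_open / 7)
    return base_cost_hours * (1 + bonus)

print(issue_reward(5, days_open=28))  # 5 hours plus a 20% bonus -> 6.0
```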

erezsh avatar Sep 03 '20 09:09 erezsh

Great idea! How could such a work estimation system be implemented on github?

I especially like that the work estimation benefits developers of higher skill, who can finish a task quicker and therefore get better compensation for their time.

Each PR will require an issue

The effort estimation could be applied retroactively to PRs.

The problem I see is the contribution of the maintainers, who might develop directly on the master branch to be more productive. Creating a PR for every commit makes you quite unproductive.

In the end, what you proposed is another metric, and metrics have the problems you already mentioned. Therefore they should be combined, so that the result has fewer disadvantages than any single metric.

fdietze avatar Sep 03 '20 09:09 fdietze

@fdietze

I think there will have to be an external site that links to the GitHub issues (and each issue can contain a link back to that site).

Creating a PR for every commit makes you quite unproductive

The idea is to create a PR for a bunch of commits, not each one. I agree it doesn't solve how to reward those who do casual maintenance and refactoring, which are important. Perhaps those can be represented as a "continuous issue".

P.S. I think this idea can live alongside a bounty system.

P.P.S If we have to pick a metric, I think the highest reward should go for lines deleted 😄

erezsh avatar Sep 03 '20 09:09 erezsh

P.P.S If we have to pick a metric, I think the highest reward should go for lines deleted

I had exactly this idea before! :rofl: Less code is always better. The problem comes when code is moved or SVGs are changed...

fdietze avatar Sep 03 '20 09:09 fdietze

I believe that there is no metric that faithfully correlates to the contribution

Please note that LibreSelery takes this into account. We calculate different weights and these are summed up. There will never be one ideal metric that can determine the performance of totally different people. That is why our architecture allows you to add more and more weights. These weights can be balanced between each other depending on your project and community.
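
For readers new to the project, the combination described above amounts to a weighted sum over several metrics. The following sketch only illustrates that idea; the metric names and weights are invented for the example and are not LibreSelery's actual configuration:

```python
# Hypothetical per-contributor metric scores, each normalized to [0, 1].
metrics = {
    "alice": {"commits": 0.7, "issues_closed": 0.2, "review_activity": 0.5},
    "bob":   {"commits": 0.3, "issues_closed": 0.8, "review_activity": 0.1},
}

# Project-specific balance between the metrics; tuned per project and community.
weights = {"commits": 1.0, "issues_closed": 2.0, "review_activity": 0.5}

def combined_score(scores):
    return sum(weights[name] * value for name, value in scores.items())

scores = {user: combined_score(m) for user, m in metrics.items()}
total = sum(scores.values())
shares = {user: score / total for user, score in scores.items()}
print(shares)  # relative payout shares after combining all weights
```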

P.P.S If we have to pick a metric, I think the highest reward should go for lines deleted 😄

It would actually be a good metric, but it could result in code that is no longer readable and does not contain comments. Like any weighting, it has advantages and disadvantages. Only by combining several weights can you compensate for this.

This issue relates to: https://github.com/protontypes/libreselery/issues/132

Ly0n avatar Sep 03 '20 09:09 Ly0n

It would actually be a good metric, but it could result in code that is no longer readable and does not contain comments.

Such PRs should simply not be accepted by the community.

fdietze avatar Sep 03 '20 09:09 fdietze

@Ly0n

That sounds like a good approach. It makes sense that a project will have different "budgets", like new features vs. bugfixes vs. maintenance, and I imagine they'll want different metrics for each one.

I also agree that readability should be the goal, and not terseness. They are only somewhat correlated. But it should be a community standard that doesn't vary by reward. Quality of code is hard to judge, even for humans.

erezsh avatar Sep 03 '20 10:09 erezsh

Since it seems this was not mentioned yet -- "labels" for PRs/issues are available only to contributors. So they could assign a label from some predefined set to the PR (e.g. from value-tiny to value-huge), which would weigh that PR's contribution accordingly whenever it is merged. Then it would be important to relay in README.md or CONTRIBUTING.md the scale of the value-s and what each one is typically assigned to, e.g.

  • value-tiny -- typo fixes and other changes which might even touch lots of the code base; still very much appreciated, but of overall small bounty value
  • ...
  • value-huge -- developers consider this work of huge value to the project: e.g. a critical fix for a nasty bug, development of a large new component, ...
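
A sketch of how such value labels could be translated into payout weights (the numeric values and the default are assumptions for illustration, not an agreed-upon scale):

```python
# Hypothetical weight per value label; a project would document these in CONTRIBUTING.md.
LABEL_WEIGHTS = {
    "value-tiny": 0.25,
    "value-small": 0.5,
    "value-medium": 1.0,
    "value-large": 2.0,
    "value-huge": 4.0,
}
DEFAULT_WEIGHT = 1.0  # merged PRs without a value label count as "medium"

def pr_weight(labels):
    """Return the payout weight for a merged PR given its label names."""
    for name in labels:
        if name in LABEL_WEIGHTS:
            return LABEL_WEIGHTS[name]
    return DEFAULT_WEIGHT

print(pr_weight(["bug", "value-tiny"]))  # 0.25
print(pr_weight(["enhancement"]))        # 1.0 (falls back to the default)
```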

yarikoptic avatar Sep 03 '20 20:09 yarikoptic

@yarikoptic @erezsh This implementation would actually work. I was also thinking of using emojis as some kind of community vote. We also get them from the GitHub API: https://developer.github.com/v3/issues/#reactions-summary
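
A sketch of how such reaction counts could be turned into a crude community vote, assuming the reactions summary documented at the link above; which reactions count as up- or down-votes, the issue number used, and the token handling are assumptions for illustration:

```python
import requests

# Assumption: these reaction types are treated as up- and down-votes.
UPVOTES = {"+1", "heart", "hooray", "rocket"}
DOWNVOTES = {"-1", "confused"}

def community_vote(owner, repo, issue_number, token=None):
    """Fetch an issue and reduce its reactions summary to a single vote score."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
    reactions = requests.get(url, headers=headers).json().get("reactions", {})
    up = sum(reactions.get(r, 0) for r in UPVOTES)
    down = sum(reactions.get(r, 0) for r in DOWNVOTES)
    return up - down

# Issue number 123 is a placeholder, not a reference to a real issue.
print(community_vote("protontypes", "libreselery", 123))
```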

Ly0n avatar Sep 04 '20 05:09 Ly0n

to throw my five cents in here:

As @Ly0n correctly pointed out, no single metric is enough to evaluate the progress or benefit of actual work done (committed).

I am opposed to the idea of "manual" estimation, evaluation and validation. It simply is not feasible to let humans do that. It not only rests on subjective morals but also limits this process by the human effort available (which simply scales badly).

I don't want to "plug" myself or my ideas here, but issue #132 (https://github.com/protontypes/libreselery/issues/132) is related to this conversation. I proposed my idea of fair metrics there ... maybe some people would agree/disagree with my opinion, which would help.

More off topic:

I was having a nice talk with a mechanic (a friend of a friend) working on cars in a garage for a small company in Berlin. I asked him about fairness in payment, what his moral standing is on how to pay people better, and what kind of work would have to be compensated in which manner. He of course did not answer my question directly (because it is kind of philosophic), but I could extract some valuable information from someone not really into the field of software or coding in general (hell, even the product we are talking about is fundamentally different from fixing cars on contracts).

I found it hilarious that we could agree on stuff so easily without sharing the same context; here are the summed-up results:

  • He agreed that different "types" of work contribute differently to the overall product
    • i.e. cleaning, preparations, customer support (talking to people, clarifying changes and so on), actual implementation (doing the fixing)
    • that means all of these jobs are important, but some are more important than others, and someone (his boss) has to define those "weights", which then have to be properly communicated to all the coworkers.
  • some people are better at some jobs than others, not only in quality but in time, materials needed and so on
    • so for him it is only natural that people can do different tasks and can share the workload in these respective tasks
  • as far as he was concerned, it does not matter to him that others are paid exactly the same as he is, even when these people don't do all the tasks that matter (or even if they do them "not as well" as he does). They are working on the same thing, and as long as everyone can live with what he gets personally, he would be fine with both of them getting the same wage overall, even if someone did less but contributed to the same things as he did.

Back to topic:

I think that each and every contribution should be defined as equal. At least, each and every contribution of the same type should be viewed as equal. I do however like the idea of PR qualifiers which further identify the relevance of a contribution. This is not only a helpful thing to add, it is also rather convenient to do (in a multitude of ways: labels, upvotes, comments, etc.). I would argue that this should be optional and should only "decrease" the relevance, not "increase" it.

Meaning that if no one is able to manage a project properly (regardless of the reason), its default configuration should regard every contribution (of its type) as equal to the others. Putting whatever qualifier on a pull request can downgrade a contribution to "minimalistic", "not relevant" or "mediocre". By doing that, the person putting on the label is automatically in the spotlight and has to properly communicate his decision (if he has the authority/time to do it). In short:

  • If I (project owner) do not intervene, the contribution made and all its contents/contributors are equally valued
  • If I do intervene, I have the right to downgrade the contribution, which leaves me in the spot of explaining why I personally think that a contribution is "not so important" (which in itself is kind of ridiculous; that's why I want to avoid it), and I am open to criticism regarding that decision.

kikass13 avatar Sep 04 '20 13:09 kikass13

@Ly0n But github reactions are limited to 8 icons, no? And they would be pretty confusing in this context. I think that if you really must keep in the context of github, it might make more sense to add directives to comments. That would also be much more flexible.

$estimate: 5 weeks
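
A sketch of how such a directive could be parsed from issue comments; the regular expression, the unit-to-hours table, and the averaging rule are assumptions for illustration:

```python
import re
from statistics import mean

# Matches lines like "$estimate: 5 weeks" anywhere in a comment body.
ESTIMATE_RE = re.compile(r"^\$estimate:\s*(\d+(?:\.\d+)?)\s*(hours?|days?|weeks?)\s*$",
                         re.IGNORECASE | re.MULTILINE)
HOURS_PER_UNIT = {"hour": 1, "day": 8, "week": 40}  # assumed working-time conversion

def parse_estimates(comment_bodies):
    """Collect all $estimate directives (converted to hours) from comment texts."""
    hours = []
    for body in comment_bodies:
        for amount, unit in ESTIMATE_RE.findall(body):
            hours.append(float(amount) * HOURS_PER_UNIT[unit.rstrip("s").lower()])
    return hours

comments = ["Looks doable.\n$estimate: 5 weeks", "$estimate: 3 weeks", "no estimate here"]
print(mean(parse_estimates(comments)))  # 160.0 hours -> the averaged issue cost
```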

erezsh avatar Sep 04 '20 13:09 erezsh

@erezsh

The whole concept of "time to fix something" is not really helpful. This is not a project management tool. We are not measuring work done. I would rather move away from those "simple" constructs and reach a better understanding of fairness in rewards and contributions than "x amount of stuff in time t". But maybe that's only me talking.

We are in open source; the fact that people CARE and contribute is not a matter of project management. These are not wages for skilled or unskilled, lazy or zealous, good-looking or whatever software developers, artists, or writers. I would argue that each and every contribution is a selfless donation of time, work and commitment by a person with no ulterior motives.

So each and every contribution has to be respected:

  • even if it's not perfect
  • even if it's changed afterwards and only temporary
  • even if it's a simple thing or a huge game changer

The fact that people care and work together is what should be rewarded.

kikass13 avatar Sep 04 '20 13:09 kikass13

@Ly0n But github reactions are limited to 8 icons, no? And they would be pretty confusing in this context. I think that if you really must keep in the context of github, it might make more sense to add directives to comments. That would also be much more flexible.

$estimate: 5 weeks

Yes, it would be possible to do it in this way.
Maybe you are right. It could be confusing to use the emojis.

@kikass13 @erezsh If multiple users want a weight based on time estimations made by users, we can implement it. LibreSelery should give people the freedom to choose between different weights. If you don't like a weight, give it a weight of 0 or a small weight relative to the others.

But I must also concede to @kikass13 that time estimations are often very difficult. If you have a clear task like "Review this code", it is simpler than an issue like "the user experience is lagging". Slow user experience can mean a complete rewrite of your code or maybe just a little bug.

When issues are written in a way that makes it clear what needs to be done, it is much easier. It highly depends on the task.

Ly0n avatar Sep 04 '20 16:09 Ly0n