pitest icon indicating copy to clipboard operation
pitest copied to clipboard

Rounding can give false 100% in html report

Open esovan opened this issue 5 years ago • 13 comments

To avoid false 100% coverage in html report i suggest using floor instead of round

esovan avatar Jun 23 '20 12:06 esovan

I also faced this issue in my project https://github.com/interacto/interacto-java-api/releases in its version 4.3.0. One mutant was not killed but the score was 100%. My mutation threshold was 100% but pitest built. Maybe the score should not be an int value.

arnobl avatar Oct 09 '20 12:10 arnobl

@esovan , is this PR to fix problem https://github.com/hcoles/pitest/issues/375 ?

romani avatar Nov 16 '21 05:11 romani

I think so, its a similar problem at least. We are in our pipeline only using pitest for mutation coverage, so I'm not sure the solution works for line coverage also. The problem i set out to fix was that we could get 100% in the total value, but not in all packages. Example in following picture: image

Looks like the problem is there in line coverage also: image

esovan avatar Nov 16 '21 12:11 esovan

Imho rounding is the better way generally, but it should ensure that it is never rounded to 100%.

That's why I opened https://github.com/hcoles/pitest/pull/638 two years ago that fixes exactly that and is conflicting to this change.

Vampire avatar Nov 16 '21 12:11 Vampire

In my opinion it should never be allowed to round up. Because then it is promising lets say 80% is covered, even though it might just be 79.5%. With this change it will promise only 79% and you get the last .5% as a bonus. I don't see why you would be interested in the coverage if its not to say: "this amount of the code must be covered" or "this amount of the code is definitely covered"

esovan avatar Nov 16 '21 13:11 esovan

Sorry I've been quiet/absent in these discussions. I don't pay much attention to the % myself (I use pitest purely for qualative analysis), but every time the calculations are touched it causes disagreement.

If you scan through the issues you'll find conflicting complaints that it should round up vs down.

If someone can put together a PR that makes it an configuration parameter (and defaults it whichever way they prefer) I'll merge it in.

hcoles avatar Nov 16 '21 13:11 hcoles

#776 is correcting the html report #638 is correcting the percentage computation used by the build (it means, it will fail the build if you have a threshold = 100 % and you still have one mutant alive)

We can probably put the 2 PR in one.

nicolasSoyeur avatar Nov 18 '21 03:11 nicolasSoyeur

This issue "the build is not failing if I have one mutant still allive and my threshold = 100%" is very painfull to my team. We are facing this issue multiple times a month, because our threshold is to 100%. I do agree with @esovan, the round floor for the percentage computation is the more logical way of computing it.

If I have a mutant still alive, Im not into "100 mutation coverage". It is the principile of the mutation test : you want to check if your unit test are bullet proof or not. If you have a mutation coverage of 99.6%, you are not 100% bullet proof :)

nicolasSoyeur avatar Nov 18 '21 03:11 nicolasSoyeur

Off topic:

for qualative analysis

@hcoles , Can you share details ?

romani avatar Nov 19 '21 06:11 romani

@romani Posh way of saying I look at the detailed results (generated frequently) rather than the summary figures. See https://blog.pitest.org/dont-let-your-code-dry/

hcoles avatar Nov 19 '21 08:11 hcoles

thanks a lot for sharing an article.

Right now I understand why you do not understand us. You work in different environment, where coverage is not enforced on fanatic level 100% and other engineers might not be interested to use this tool and care less on testing.

In Checkstyle project we are fanatic on quality and we have 100% branch coverage and almost reached to 100% by pitest on whole code. It helped us tremendously to prove that there is no extra code(code with no effect behavior) in contribution by PR. It helps to let contributor to think on tests more, so as result our code is more likely to stay in form that it is easy to change by somebody else. New contributors do not deal with technical debt from previous contributors, and are not allowed to leave dept for future contributors.

Pitest works super well in more algorithmic code (library), fortunately our project is like this, so 100% fanaticism is possible and reasonable.

romani avatar Dec 01 '21 08:12 romani

I think the problem is when coverage/mutation percentage is used as a threshold for passing builds. The rounded percentage score under-reports missed coverage/mutations, so when used as a threshold, builds may pass despite failure to meet the minimum threshold.

How about adding "missed coverage" and "missed mutation" thresholds, expressed as a maximum count? This is what JaCoCo does, and it works very well for teams that want 100% perfect scores.

Something like:

<execution>
  <id>mutation-coverage</id>
  <goals>
    <goal>mutationCoverage</goal>
  </goals>
  <configuration>
    <missedCoverageThreshold>0</missedCoverageThreshold>
    <missedMutationThreshold>0</missedMutationThreshold>
  </configuration>
</execution>

willhains avatar Dec 26 '21 03:12 willhains

Im trying to add the configuration parameters that @hcoles proposed. I think I did it well for the maven build, but I need to work on the "reporting phase" you can have a look at it here : https://github.com/nicolasSoyeur/pitest/tree/nso/roundUpMutationTotal

I cannot find how to have access to the configurations in the mutation totals computations in the reporting phase. If anyone knows, it could be very helpfull

nicolasSoyeur avatar Jan 15 '22 23:01 nicolasSoyeur

@esovan you could consider whether this PR is obsolete. My #638 was finally merged into 1.13.2, so you should not get false 100% anymore. It still uses round, but uses at max 99% as long as there is at least one remaining mutant.

Vampire avatar May 25 '23 09:05 Vampire

Ah, no, those are different numbers. But I suggest the same logic, properly rounded but at most 99% if there is at least one left.

Vampire avatar May 26 '23 07:05 Vampire

#1221 does exactly that :-)

Vampire avatar May 26 '23 10:05 Vampire

Ok, now this should be obsolete :-)

Vampire avatar May 30 '23 09:05 Vampire

It seems the consensus is that this issue is resolved. I haven't tested this myself but I trust the community :)

esovan avatar May 30 '23 10:05 esovan