Exclude code from coverage
We should consider excluding code from coverage reports, e.g. plotting, error handling (we won't test that anyway), or terribly slow parts (unless we find a way to test them nonetheless)...
One way to do that is to use the # pragma: no cover comment; see https://coverage.readthedocs.io/en/7.0.5/excluding.html.
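To make this concrete, here is a minimal sketch of how the pragma would be used (the function names are made up for illustration, not actual pySDC code):

```python
# Minimal sketch: excluding a plotting-only helper from coverage.
# Function names are hypothetical, not actual pySDC code.

def compute_residuals(values):
    """Core logic: stays in the coverage report and should be tested."""
    return [abs(v) for v in values]


def plot_residuals(values):  # pragma: no cover
    """Plotting-only helper: coverage.py drops this whole function."""
    import matplotlib.pyplot as plt

    plt.semilogy(compute_residuals(values))
    plt.xlabel('iteration')
    plt.show()
```

Since the pragma is on the function definition line, coverage.py excludes the entire function body, while the core logic it calls remains covered.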
Thoughts?
I wonder if this won't encourage using # pragma: no cover instead of adding tests. And the code coverage indication is still valuable in my opinion. Why not just report coverage for each PR (with improvements/degradations) rather than making it a condition for the PR to pass CI?
On another note, if we switch in the near future to Jupyter notebooks for project/tutorial descriptions, instead of using one script that generates plots and figures, maybe that could solve the problem (considering that every notebook would be tested by a simple re-run...)?
Well, we can always just ignore degradation... but if the degradation comes from stuff like plotting, then I fail to see how to test that reasonably.
Yet, if people just put this pragma everywhere, nothing is gained, of course. But hey, no matter what we decide, that is what reviews are for...
Actually, excluding code from coverage seems to be a research subject in itself; see this (interesting) article: https://homepages.dcc.ufmg.br/~andrehora/pub/2021-msr-test-coverage-exclusion.pdf
From a first quick look, it seems that the # pragma: no cover approach is actually a good idea, provided we define somewhere what may be excluded and what must not be (and then review to enforce that :wink:)
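One natural place to "define somewhere" such a project-wide policy would be pyproject.toml. A sketch of what that could look like (the patterns below are only a suggestion, to be agreed on in review; note that setting exclude_lines replaces coverage.py's built-in defaults, so "pragma: no cover" has to be listed again):

```toml
[tool.coverage.report]
exclude_lines = [
    # setting exclude_lines overrides the default, so re-add the pragma
    "pragma: no cover",
    # candidate patterns discussed here -- adapt to what the team agrees on
    "raise NotImplementedError",
    "if __name__ == .__main__.:",
]
```

Each entry is a regular expression; any line matching one of them is excluded from the coverage report.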
I like the statements. Jupyter is all well and good, but I do not want to rely on it. When I am making plots for a talk, for instance, I want nothing to do with Jupyter, but I still want the script to be in the repo. Also, now that you included markdown support with plots, I may choose that over Jupyter.
I tested these pragmas here. Since individual plotting scripts are not part of the functionality that others care about, I think we can safely exclude them from testing with these statements. "Serious" plotting scripts that generate plots for papers are, after all, executed by the pipeline.
I am not sure what to do about debugging statements. If I check that someone has passed the correct kind of parameters, I don't want to write a test for every way of triggering a debug statement, but the error messages may be broken, which others may care about.
So I am in favour of excluding plotting-only functions, but nothing else.
> I like the statements. Jupyter is all well and good, but I do not want to rely on it. When I am making plots for a talk, for instance, I want nothing to do with Jupyter, but I still want the script to be in the repo. Also, now that you included markdown support with plots, I may choose that over Jupyter.
yeah ... :sweat_smile: ... this actually still needs to be tested for markdown documentation on projects. I'm not totally set on the optimal structure for projects/tutorials/etc. that would allow markdown documentation to be included in the website along with the sphinx-generated documentation of the API.
But you have a good point though: Jupyter notebooks may be used for tutorials and project descriptions (that would probably be better), but you still need scripts to generate figures, tables, etc. for articles/presentations. Maybe those scripts should have a coverage-exclusion pattern, since I don't really know how to test them without saving figure data somewhere and comparing against it (which seems quite overkill...)
> So I am in favour of excluding plotting-only functions, but nothing else.
Second that.
What about this (in core.Step.py):

Coverage seems to exclude the lines that raise errors, but not the rest of the check that comes before them.
Yeah, good point. In pyproject.toml, the lines with a raise are excluded, but not the ones before. I guess we COULD decide to exclude error handling too, although this is not really good practice.
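To illustrate the partial exclusion being discussed, here is a hedged sketch (a hypothetical parameter check, not the actual code in core.Step.py): with a pattern excluding raise lines, coverage still flags the condition above the raise as uncovered unless a test actually triggers it.

```python
# Hypothetical parameter check, NOT the actual core.Step.py code.

def set_order(order):
    # this condition still shows up as (partially) uncovered
    # unless some test passes a bad value
    if not isinstance(order, int) or order < 1:
        raise ValueError(f'order must be a positive integer, got {order!r}')  # excluded by a raise pattern
    return order
```

So excluding only the raise line leaves the branch itself counted against coverage, which matches what was observed above.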
So ... best practice would be to write tests that trigger the errors? Do you really want that for a code like pySDC? :sweat_smile:
That's my point. This is good practice, but honestly... if the code fails right before it throws an error anyway... pff.
So, I'd be fine with excluding those blocks, too. At least for now.
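For reference, the "good practice" alternative being dismissed here would look something like this (the checked function is hypothetical, for illustration only, not actual pySDC code):

```python
import pytest


def set_order(order):
    """Hypothetical parameter check, for illustration only."""
    if not isinstance(order, int) or order < 1:
        raise ValueError(f'order must be a positive integer, got {order!r}')
    return order


def test_set_order_rejects_bad_input():
    # covers both the condition and the raise, so no exclusion is needed
    with pytest.raises(ValueError):
        set_order(0)
```

One such test per check would make the exclusion patterns unnecessary, at the cost of the test-writing effort discussed above.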