diff_cover
diff_cover copied to clipboard
False negatives for function calls splitted along multiple lines
Hi, thanks for this library. It is very helpful in our project!
I think I found an issue, false negative, that affects overall quality of diff-cover check in a specific use case.
I use diff-cover 6.4.4 with coverage.py in version 6.2 through pytest-cov in version 3.0.0 with Cobertura XML report.
- Recently I updated function call in my production code (the signature of the function I use changed, it has a new argument, and I've added new keyword argument to the call)
- This production call has no test coverage.
- Diff-cover doesn't fail for the line with a new argument in the function call.
Expected:
- Diff-cover fails for the line with a new argument in function call because of lack of the coverage.
Details: I think that probably diff-cover ignores lines without coverage report, and probably (?) it should for every changed line in diff search for first line above with reported coverage. On the image below red line point to lines not covered by tests (GitLab supports Cobertura reports).
$ diff-cover coverage.xml --fail-under 100 --html-report diff-cover.html --compare-branch origin/develop
-------------
Diff Coverage
Diff: origin/develop...HEAD, staged and unstaged changes
-------------
…/file_on_screenshot.py (100%)
…/obfuscated.py (100%)
…/obfuscated2.py (100%)
-------------
Total: 13 lines
Missing: 0 lines
Coverage: 100%
-------------
Reading over the issue it almost sounds like the coverage report was not regenerated after the change?
"I think that probably diff-cover ignores lines without coverage report"
If a line is not in the coverage report at all that suggests the report is stale in some way. If the coverage report was not generated against the most up to date version of the branch I would think the behavior would be undefined. It would not be possible to tell if the intended code was covered or not.
I could be misunderstanding you though.
Do you think it would be possible to make a small example repo showing off this behavior? This is the kind of thing that will be a lot easier to diagnose and fix with a reproducible example.
I've made a small example repository: https://github.com/m-aciek/diff_cover_multiline_signature.
"I think that probably diff-cover ignores lines without coverage report"
XML generated for my example is following:
<class name="app.py" filename="app.py" complexity="0" line-rate="0.8" branch-rate="0">
<methods/>
<lines>
<line number="1" hits="1"/>
<line number="4" hits="1"/>
<line number="5" hits="1"/>
<line number="8" hits="1"/>
<line number="9" hits="0"/>
</lines>
</class>
Coverage.py utility treats first line of a function call as a line to report on. If the call is splitted across multiple lines, change in a line other than first will be missed.
False negative
Index: app.py
<+>UTF-8
===================================================================
diff --git a/app.py b/app.py
--- a/app.py (revision e4d252219edd3574f041a3bf8b29cab2140d80c5)
+++ b/app.py (revision 82bd286ff2e06a65d0474ed7432baa0f9a11be3a)
@@ -8,4 +8,5 @@
def call_hello_world_personalized():
return hello_world_personalized(
name='Maciek',
+ greeting='Hiya',
)
-------------
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
-------------
No lines with coverage information in this diff.
-------------
True positive (single line instead of multiple lines)
Index: app.py
<+>UTF-8
===================================================================
diff --git a/app.py b/app.py
--- a/app.py (revision e4d252219edd3574f041a3bf8b29cab2140d80c5)
+++ b/app.py (revision c31319ffbf10da06e14f03c630c8e0172fdb812c)
@@ -6,6 +6,4 @@
def call_hello_world_personalized():
- return hello_world_personalized(
- name='Maciek',
- )
+ return hello_world_personalized(name='Maciek', greeting='Hiya')
Failure. Coverage is below 100%.
-------------
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
-------------
app.py (0.0%): Missing lines 9
-------------
Total: 1 line
Missing: 1 line
Coverage: 0%
-------------
I am not sure though what approach should be used to fix that.
Thanks, ill make some time in the next few days to look at this closely. I starred and cloned the repo.
@m-aciek
Ok yeah, I see the issue now. Had to run your example myself and really stare at it for a bit. Im actually somewhat shocked this has not come up before.
At the same time I am fairly stumped on how to deal with this.
The coverage report is reporting at the start of the statement. While the diff only cares about the literal line changed.
Each tool is doing the right thing. But it breaks what diff cover tries to do.
To fix this I think diff_cover would have to identify that the line is part of a larger statement. Then identify the start of that statement to realign things. Which, while annoying and complex... would be fine if diff cover was strictly a python tool. But this tool is supposed to support multiple languages.... so I just am not sure how to handle this.
That all being said... I am not against fixing this for python and just letting other languages file similar issues as they are found by actual users (I'm assuming other languages would potentially have similar issues...)
I wanna say the high level steps are
- Identify we are dealing with python (assume if it ends in .py its python.... I feel like thats fine)
- If a line in the diff for that file is not in the coverage report and its not a comment or whitespace -> identify the start of whatever statement that line is in -> identify the line number of that start -> Use that to compare to the report
" identify the start of whatever statement that line is in" seems potentially tricky... maybe the ast module can help? This gets PARTICULARLY hard given I dont know what version of python code im looking at....
Ill be honest, may be a while on this one...
I wonder if the answer here is to report on lines in the diff we could not line up to the report.
Thank you for looking into this. In my opinion there are couple of more ways to go forward with it:
- Investigate if Cobertura XML format is capable of storing metadata about what lines are covered by a statement being reported. And investigate if Coverage could extend the reporting with it.
- Investigate if native coverage metadata format (
.coverageSQLite file) contains such data, about numbers of lines that cover a multiline reported statement. If yes, consider using this data to prepare diff-cover report. - Constructive approach (identifying Python files and identifying multiline statements), what you've described.
- Implementing such functionality: if line from diff is not being reported (is missing in the report), going up line by line to the first line that's being reported and treating it as a coverage result. I am not sure if it doesn't have some caveat that causes that approach to be incorrect.
I wonder if the answer here is to report on lines in the diff we could not line up to the report.
That way we wouldn't be able to make it be not reported unless we reformat the source file by covering and putting whole statement in a one line, so in my opinion it's not right approach.
I think the next step here is to open an issue with coveragepy and see if they have advise. I recall the SQLite database recently added but I vaguely recall that being considered private and not something I would wanna rely on.
The cobertura format last time I tried to investigate it is not terribly well documented.
but Ned is typically pretty helpful and he may have some ideas or thoughts.
I ran into this with a TypeScript project; I made a small PR that modifies a multi-line const declaration which looks like this:
const SomeRecord: Record<string, string[]> = {
Foo: ["Bar"],
Far: ["Boo", "Baz"],
...
};
The project generates coverage report through jest, and the resulting (cobertura) report only includes the const ... line. Library versions used:
- @jest/core 26.6.3
- istanbul-reports 3.1.4 (supports various coverage report formats, including clover)
At the moment, I don't feel like fixing this is in the realm of must-haves for my use case (but still a nice-to-have). diff-cover's report could be treated like an ambient indicator of files that have lots of missing coverage, and in practice, it's probably unlikely that sizable PRs only touch multi-line statements.
Implementing such functionality: if line from diff is not being reported (is missing in the report), going up line by line to the first line that's being reported and treating it as a coverage result. I am not sure if it doesn't have some caveat that causes that approach to be incorrect.
Are there any problems that this would solve? I suppose it makes the assumption that the coverage report generator uses the first line, when they could use the last line, or even something like the middle line, but in reality I'd imagine that most coverage reports standardize on the first line.
If it's risky, maybe it should be an opt-in behavior via some diff-cover command-line flag?
Hello, does any workaround is known as of today? Are there any plans from the discussions with the coverage report tools? And is there anything we could do to maybe help with an implementation like suggested above by christianbundy? Thank you
I opened a PR #416 based on this suggestion:
Implementing such functionality: if line from diff is not being reported (is missing in the report), going up line by line to the first line that's being reported and treating it as a coverage result. I am not sure if it doesn't have some caveat that causes that approach to be incorrect.
It is working well on my tests for Python and reports with pytest. As mentioned, this might be dependent on languages and coverage tools, so further testing and feedbacks from other people may be valuable.