Diff result is always empty
Hello!
I have two cobertura files and I try to get diff between them. Source code for both of them is the same because they are produced by two different services which live in the same monorepository. So my command looks like this:
pycobertura diff --source1 . --source2 . cobertura-coverage.xml cobertura-coverage2.xml
The problem is that it returns an empty diff which looks like this:
Filename Stmts Miss Cover Missing
---------- ------- ------ ------- ---------
TOTAL - -
I thought that maybe my xml files are corrupted or something like this but it seems that show command works fine with them.
What can be the possible cause of such behavior?
Thanks!
Do you get anything by doing:
pycobertura diff cobertura-coverage.xml cobertura.coverage2.xml
... ? If not, are the two files identical?
Hi @aconrad! As far as I remember in that way I don’t get anything either No, they are not identical
Hm. Any chance those coverage reports can be shared so we can take a look at them? (without the source)
@aconrad It seems I found the reason. I was really using the wrong file in comparison. But after I resolved this issue new ones appeared. Can you please tell me if there is a way to understand why diff ends up with exit code "2"? What is the exact reason?
Hi @GiddeonWyeth, the pycobertura diff exit codes are documented here: https://github.com/aconrad/pycobertura/#diff-exit-codes
Let me know if that answers your question.
Hello @aconrad. More or less yes, thanks!
I can't understand why when two coverage files are being compared pycobertura does this in a strange way.
For ex. : if in the first coverage we don't have a file named foo.py and in the second one we do than in resulting diff we will see that file and all covered lines there will be marked as ones with increased coverage and the ones which were not will be marked as decreased
Why don't we skip that file?
That might be due to the fact that you are comparing the same code base with two coverage files that were generated from different code base (or the same code base at different versions). If the foo.py did not exist at the time the first coverage file was generated then pycobertura has no idea that it wasn't present if it exists in the code base you are giving it for --source1.
So ideally you would want a copy of your code base at the commit version that generated the first coverage file (that doesn't contain foo.py), and then another copy at the commit version that generated the second coverage. And point --source1 and --source2 to their respective code base versions.
@aconrad That's exactly what I do. The thing is that I try to compare result of manual coverage and auto-test coverage. Some manual tests execute code from files which are not touched by auto-tests at all. In such case as a result I have 2 coverage files with the same code base but different contents
Let me know if that works for you.
Arguably, regarding my previous comment, pycobertura could know that foo.py isn't present if it's not listed in the first coverage file. I'd have to check again what it does exactly under the hood... That's said, a reproducible use case for me to test would be ideal to help debug.
But, matching the code base that generated the coverage file is still the more accurate way to go about it.
@aconrad Please take a look. I use pycobertura diff --format html --output coverage-diff.html --source1 . --source2 . cobertura-coverage2.xml cobertura-coverage.xml to form diff. In resulting file you will see changes for validation-utils.js which are incorrect since second file doesn't have info about coverage of this file. Such behavior can lead to incorrect conclusion for ones who analyze the results
pycobetura.zip
Let me apologize in advance if I didn't understand you, but given the command you provided, the output is what I would expect.
You pass cobertura-coverage2.xml as first argument and cobertura-coverage.xml as second argument. Pycobertura interprets the first one as being the old file and the second one as being the new one. Here, cobertura-coverage.xml is considered the new one, it will assume that lib/util/validation-utils.js has been kept and add_actor_to_request.js is no longer present in the new version.
This behavior is similar to what the linux diff command does. If we pass the files in the same order:
diff -u cobertura-coverage2.xml cobertura-coverage.xml
... in the diff we can see those lines:
[... snip ...]
- <class name="add_actor_to_request.js" filename="lib/util/add_actor_to_request.js" line-rate="0.2" branch-rate="0">
+ <class name="validation-utils.js" filename="lib/util/validation-utils.js" line-rate="0.060599999999999994" branch-rate="0">
[... snip ...]
... which shows that validation-utils.js is kept, while add_actor_to_request.js is removed.
Does that make sense?
@aconrad The thing is that it is not actually possible to understand that the file appeared in the new coverage
When I take a look at the diff I see that in validation-utils.js some lines increased its coverage and some decreased while in fact non of them had had any coverage at all. From my perspective the valid behavior would be to highlight the lines which started being covered and mark uncovered lines as the ones which didn't change
What do you think?
@aconrad friendly reminder about this discussion :)
Let me add this screenshot from your code in pycobertura.zip that was generated with the following command to make sure we are looking at the same thing:
pycobertura diff \
--format html \
--output coverage-diff.html \
--source1 . \
--source2 . \
cobertura-coverage2.xml cobertura-coverage.xml
You are saying that in validation-utils.js there are lines that have increased in coverage and some that decreased and you are saying that file had no coverage at all. By looking at the screenshot, I would agree with this statement. We can see that line 5 and line 110 are "covered", but that's really just Javascript loading the code in memory so it effectively executes the top-level statements of the file, thus marking those lines as "covered" (not because a test was written but simply because Javascript executed them).
Then I see that the body of the methods isBoolean(), isString(), and isInt() have not been executed by Javascript, thus they aren't being marked as covered in the coverage report.
I can also see that the + sign at the beginning of the lines indicates that they are new lines (they didn't exist in cobertura-coverage2.xml, the first report passed as an argument). When there is a + sign, pycobertura will highlight each line and mark them as covered or uncovered with respect to the coverage file passed as the second argument, cobertura-coverage.xml.
The way to think about pycobertura diff is that we want to bring the developer's attention to any change in status between the previous coverage and the current coverage:
- the lines that changed or were modified (with a
+sign) are lines that the developer introduced, so we want to show the coverage status for those lines. We are answering the question "Are all lines changed by the developer covered?" - the lines that had a change in coverage status (without a
+sign). This happens when the developer adds/changes/removes tests that affect parts of the code that the developer hasn't modified. We are answering the question "Has the coverage status changed for lines that the developer didn't touch?"
From my perspective the valid behavior would be to highlight the lines which started being covered and mark uncovered lines as the ones which didn't change What do you think?
Can you explain why that would be useful? If we mark uncovered the lines that didn't change, wouldn't that be the behavior of pycobertura show which shows the complete coverage status of each file such that each file is marked as covered or uncovered? pycobertura diff works differently, as described above.
Closing due to inactivity.