BuildQualityChecks Code Coverage TotalLines completely different run-to-run

Open pjohnst5 opened this issue 7 months ago • 7 comments

Describe the context

  • Extension:
    • BuildQualityChecks
  • Environment: are you using Azure DevOps Services (cloud) or Azure DevOps/Team Foundation Server (on-prem)?
    • Azure DevOps
    • Server version: if you are running on-prem, specify the version of your Azure DevOps or Team Foundation Server
  • Agent type: are you running on a Microsoft-hosted or self-hosted agent? (not sure, how can I find that?)
    • Self hosted
    • Agent version: if you are running a self-hosted agent, specify its version
  • Pipeline type: are you using the task in a classic build, classic release, or yaml pipeline?
    • Yaml pipeline

Describe the problem and expected behavior

We are trying to gate our PRs with the BuildQualityChecks task:

- task: BuildQualityChecks@8
  displayName: "Check Code Coverage Regression"
  condition: always()
  inputs:
    checkCoverage: true
    coverageFailOption: "build"
    coverageType: "lines"
    fallbackOnPRTargetBranch: false
    baseBranchRef: "master"
    allowCoverageVariance: true
    coverageVariance: 0.25

However, we are seeing a discrepancy with the check.

In the ADO output for the task, the total line count does not match the Code Coverage report on the build: the task reports about 13,000 total lines. Pipeline ID: 231094, Build ID: 121023094

Evaluating coverage data from 1 filtered code coverage data sets...
Total lines: 13776
Covered lines: 7004
Code Coverage (%): 50.842
Found baseline build with ID 120928460.
Successfully read code coverage data from build.
Evaluating coverage data from 1 filtered code coverage data sets...
Total lines: 13774
Covered lines: 7002
Code Coverage (%): 50.8349
[SUCCESS] Code coverage policy passed with 50.842% (7004/13776 lines).
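The percentages in the log above can be reproduced by hand from the covered and total line counts. The sketch below does that and applies a variance check. The rule that the current value may drop by at most `coverageVariance` percentage points below the baseline is an assumption made for illustration, not confirmed BQC behavior, and both function names are hypothetical.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return line coverage as a percentage; 0.0 when there are no lines."""
    return 100.0 * covered / total if total else 0.0


def passes_regression_check(current: float, baseline: float, variance: float) -> bool:
    """Assumed rule: current may fall below baseline by at most `variance` points."""
    return current >= baseline - variance


# Values taken from the task log: current build, then the baseline as read by the task.
current = coverage_percent(7004, 13776)
baseline = coverage_percent(7002, 13774)
print(round(current, 4), round(baseline, 4))
print(passes_regression_check(current, baseline, variance=0.25))
```

With the ~13,000-line totals the two builds are nearly identical, so the check passes under this assumed rule regardless of the variance setting. The counts are what look suspicious, not the arithmetic.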

But the Code Coverage tab shows ~31,000 lines in total. [Image]

We also see that the baseline run (ID 120928460) itself used ~31,000 lines as the total:

Evaluating coverage data from 1 filtered code coverage data sets...
Total lines: 31430
Covered lines: 14996
Code Coverage (%): 47.7124
[WARNING] Forcing a new baseline because variable BQC.ForceNewBaseline is set to true. All policies based on the previous build will pass

Task logs

Run your pipeline with the following variables:

  • For BuildQualityChecks: System.Debug and BQC.LogRawData set to true
  • For CreateWorkItem: System.Debug set to true
  • For PostBuildCleanup: System.Debug and PBC.LogRawData set to true

Send the task log to [email protected] and reference your GitHub issue.

Will do.

Attention: The log file may contain sensitive data (e.g., server or organization names, project names, variable information). Please do not attach the log to your GitHub issue, or remove the information from the log file before attaching or sending.

pjohnst5 avatar Apr 14 '25 16:04 pjohnst5

Hi @pjohnst5,

can you please send over the logs for the pipeline?

There are a couple of things here I want to explain:

  1. Different values in Code Coverage tab and BQC/build summary
    BQC uses Azure DevOps APIs to read coverage values. I guess that you are using the Cobertura coverage report format, which contains summary and detailed information. Azure DevOps simply takes the summary values from the Cobertura file, and this is also what BQC is using. The Code Coverage tab uses the ReportGenerator library/tool to generate coverage reports. This tool does not look at the summary values in Cobertura. Instead, it has its own logic to calculate both coverable and covered elements. This often leads to different values, which is something we cannot change in BQC. We have been planning for some time to move to custom coverage parsing, which might change this. For now, please only compare values from the Azure DevOps API with the values displayed by BQC, not the values in the Code Coverage tab.

  2. Different values in current and previous build
    In most cases, this happens if there are multiple test runs in the pipeline. In that case, your tests publish multiple code coverage results, which must be merged by the Azure DevOps coverage merge job. This happens asynchronously, so, depending on how much time there is between publishing results and BQC reading the results, this merge might only be partially done. The API should be able to report this, and BQC contains the logic to wait for the final result. However, we have seen many cases in the past where the API reported a final result, which later on changed again. Again, there is nothing we can do from BQC side to fix this. You could try introducing a delay between your test step(s) and the BQC step(s).
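To illustrate point 1, here is a minimal sketch of how a summary-based reader and a detail-based reader can disagree on the same Cobertura file. The root attributes `lines-covered` and `lines-valid` are part of the Cobertura format. The recount rule used here (deduplicate by file and line number, keep the highest hit count) is only an assumption for illustration and is not ReportGenerator's actual algorithm. The sample XML and function names are made up.

```python
import xml.etree.ElementTree as ET


def summary_counts(root: ET.Element) -> tuple[int, int]:
    """Read (covered, total) lines from the root <coverage> summary attributes."""
    return int(root.get("lines-covered", 0)), int(root.get("lines-valid", 0))


def recounted(root: ET.Element) -> tuple[int, int]:
    """Recount (covered, total) from class-level <line> elements.

    Illustrative rule: a line is identified by (filename, number) and counts
    as covered if any class entry for that file reports hits > 0.
    """
    hits: dict[tuple[str, str], int] = {}
    for cls in root.iter("class"):
        lines = cls.find("lines")  # direct child only, skips per-method copies
        if lines is None:
            continue
        filename = cls.get("filename", "")
        for line in lines.findall("line"):
            key = (filename, line.get("number", ""))
            hits[key] = max(hits.get(key, 0), int(line.get("hits", 0)))
    covered = sum(1 for h in hits.values() if h > 0)
    return covered, len(hits)


# Hypothetical merged report: the same file appears twice, once per test project.
SAMPLE = """<coverage lines-covered="3" lines-valid="6" line-rate="0.5">
  <packages><package name="App"><classes>
    <class name="App.Calc" filename="Calc.cs"><lines>
      <line number="1" hits="1"/><line number="2" hits="0"/><line number="3" hits="0"/>
    </lines></class>
    <class name="App.Calc" filename="Calc.cs"><lines>
      <line number="1" hits="0"/><line number="2" hits="2"/><line number="3" hits="1"/>
    </lines></class>
  </classes></package></packages>
</coverage>"""

root = ET.fromstring(SAMPLE)
print(summary_counts(root))  # (3, 6): what a summary-based reader would report
print(recounted(root))       # (3, 3): what this recount rule yields
```

In this made-up file the summary says 3 of 6 lines while the recount says 3 of 3. That is the kind of gap that can appear between the API values and a generated report.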

Non-Microsoft coverage formats (i.e., everything except the .coverage files) are or have been treated differently, though. We would never merge results but simply keep the first result that was published. There was a change with the Publish Code Coverage Results v2 task (see here). This one is capable of merging multiple result files on the agent and then sending the already merged results to Azure DevOps. ReportGenerator has always been able to merge multiple coverage result files and, thus, might again show different values.

Long story short: To better understand what is going on in your case, I need to know how you create coverage results, what coverage format is used, and how the results are published.

ReneSchumacher avatar Apr 15 '25 08:04 ReneSchumacher

I bumped into this issue investigating why I was seeing different coverage numbers between different builds of exactly the same branch (that is, no code changes at all).

@ReneSchumacher, I found point 2 of your comment interesting in this regard, as it may explain the behaviour we are seeing. Nonetheless, I'm a bit confused to be honest whether we are in the exact same situation you pointed out.

We have several test projects which we execute with "dotnet test" using cobertura. We enabled the option 'PublishTestResults' in the dotnet task to have those results available in the azure devops tab. I can see at the end of the test execution that it is uploading these results. Though I imagine these have nothing to do with the code coverage reports that need to be merged as you mention?

Async Command Start: Publish test results
Publishing test results to test run '3559523'.
TestResults To Publish 3, Test run id:3559523
Test results publishing 3, remaining: 0. Test run id: 3559523
Publishing test results to test run '3559525'.
TestResults To Publish 1522, Test run id:3559525
Test results publishing 1000, remaining: 522. Test run id: 3559525
Publishing test results to test run '3559524'.
TestResults To Publish 20, Test run id:3559524
Test results publishing 20, remaining: 0. Test run id: 3559524
Publishing test results to test run '3559527'.
TestResults To Publish 28, Test run id:3559527
Test results publishing 28, remaining: 0. Test run id: 3559527
Publishing test results to test run '3559526'.
TestResults To Publish 376, Test run id:3559526
Test results publishing 376, remaining: 0. Test run id: 3559526
Test results publishing 522, remaining: 0. Test run id: 3559525

Next we have a task (dotnet-coverage merge) that merges the code coverage reports from the different test projects into 1 file. After this we execute the PublishCodeCoverageResults task v1 (v2 doesn't support block coverage) pointing to our merged coverage report.

Reading code coverage summary from 'D:\a\_temp\coverage\merged-coverage.xml'
Async Command Start: Publish code coverage
Publishing coverage summary data to TFS server.
 Lines- 26270 of 37271 covered.
 Branches- 3640 of 6399 covered.
Modifying Cobertura Index file
Publishing code coverage files to TFS server.
Uploading 2152 files
Total file: 2152 ---- Processed file: 70 (3%)
Total file: 2152 ---- Processed file: 198 (9%)
Total file: 2152 ---- Processed file: 340 (15%)
Total file: 2152 ---- Processed file: 471 (21%)
Total file: 2152 ---- Processed file: 604 (28%)
Total file: 2152 ---- Processed file: 740 (34%)
Total file: 2152 ---- Processed file: 886 (41%)
Total file: 2152 ---- Processed file: 1041 (48%)
Total file: 2152 ---- Processed file: 1184 (55%)
Total file: 2152 ---- Processed file: 1326 (61%)
Total file: 2152 ---- Processed file: 1466 (68%)
Total file: 2152 ---- Processed file: 1567 (72%)
Total file: 2152 ---- Processed file: 1698 (78%)
Total file: 2152 ---- Processed file: 1834 (85%)
Total file: 2152 ---- Processed file: 1990 (92%)
Total file: 2152 ---- Processed file: 2136 (99%)
File upload succeed.
Published 'D:\a\_temp\cchtml' as artifact 'Code Coverage Report_1027682'
Async Command End: Publish code coverage

The figures listed in the previous output tend to differ if I execute the same build again (no code changes). Example:

Async Command Start: Publish code coverage
Publishing coverage summary data to TFS server.
 Lines- 26060 of 36981 covered.
 Branches- 3308 of 6059 covered.
Modifying Cobertura Index file
Publishing code coverage files to TFS server.
Uploading 2152 files

Right after the Publish Code Coverage Results task we execute the BQC task. Again, the figures printed here differ between builds (same code). In this particular build for which I provided the prior output logs, I saw 0% code coverage, while the next build, numbers matched the "Publish Code Coverage Report" task.

Starting: Check Code Coverage has increased from previous build
==============================================================================
Task         : Build Quality Checks
Description  : Breaks a build based on quality metrics like number of warnings or code coverage.
Version      : 9.2.2
Author       : Microsoft
Help         : [[Docs]](https://github.com/MicrosoftPremier/VstsExtensions/blob/master/BuildQualityChecks/en-US/overview.md)
==============================================================================
Using IdentifierJobResolver
Validating code coverage policy...
Successfully read code coverage data from build.
Total lines: 0
Covered lines: 0
Code Coverage (%): 0
Required Code Coverage (%): 50
[ERROR] The code coverage value (0%, 0 lines) is lower than the minimum value (50%)!

Next build:

Starting: Check Code Coverage has increased from previous build
==============================================================================
Task         : Build Quality Checks
Description  : Breaks a build based on quality metrics like number of warnings or code coverage.
Version      : 9.2.2
Author       : Microsoft
Help         : [[Docs]](https://github.com/MicrosoftPremier/VstsExtensions/blob/master/BuildQualityChecks/en-US/overview.md)
==============================================================================
Using IdentifierJobResolver
Validating code coverage policy...
Successfully read code coverage data from build.
Evaluating coverage data from 1 filtered code coverage data sets...
Total lines: 36981
Covered lines: 26060
Code Coverage (%): 70.4686
Required Code Coverage (%): 50
[SUCCESS] Code coverage policy passed with 70.4686% (26060/36981 lines).

I'm a bit puzzled to say the least. Is this related to what you explained, or should I be looking for other explanations?

Thanks already.

ptemmer avatar May 20 '25 10:05 ptemmer

Seems all of our issues were solved by using ReportGenerator instead of PublishCodeCoverageResults. Numbers don't jump around anymore between builds, and BQC matches numbers in the HTML report.

ptemmer avatar May 23 '25 09:05 ptemmer

Hi @ptemmer,

sorry for the delayed response. First, I'm happy that you could resolve your problem by using ReportGenerator. Can you maybe give a little more detail about what you did? Afaik, ReportGenerator can merge coverage results and generate reports but not publish results to Azure DevOps (if I'm not missing anything).

I find it a little strange that the Publish Code Coverage Results task reports two different values when you run it for the same build. Are you sure that the merged file hasn't changed?
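One way to check whether the merged file changed between two runs is to compare the reports file by file rather than by hash. Cobertura files usually carry a timestamp, so raw hashes would differ anyway. The sketch below assumes both merged Cobertura reports are available as strings (for example, downloaded from the two builds). The function names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def per_file_counts(xml_text: str) -> dict[str, tuple[int, int]]:
    """Map each source filename to (covered, total) distinct line numbers."""
    hits: dict[str, dict[str, int]] = defaultdict(dict)
    for cls in ET.fromstring(xml_text).iter("class"):
        lines = cls.find("lines")
        if lines is None:
            continue
        per_line = hits[cls.get("filename", "")]
        for line in lines.findall("line"):
            number = line.get("number", "")
            per_line[number] = max(per_line.get(number, 0), int(line.get("hits", 0)))
    return {name: (sum(1 for h in by_line.values() if h > 0), len(by_line))
            for name, by_line in hits.items()}


def diff_reports(first: str, second: str) -> dict[str, tuple]:
    """Return files whose (covered, total) differ; None marks a file missing from one report."""
    a, b = per_file_counts(first), per_file_counts(second)
    return {name: (a.get(name), b.get(name))
            for name in sorted(a.keys() | b.keys())
            if a.get(name) != b.get(name)}


# Made-up reports from two runs of the same commit; Util.cs is missing in the second.
RUN_A = """<coverage><packages><package><classes>
  <class filename="Calc.cs"><lines><line number="1" hits="1"/><line number="2" hits="0"/></lines></class>
  <class filename="Util.cs"><lines><line number="1" hits="4"/></lines></class>
</classes></package></packages></coverage>"""
RUN_B = """<coverage><packages><package><classes>
  <class filename="Calc.cs"><lines><line number="1" hits="1"/><line number="2" hits="0"/></lines></class>
</classes></package></packages></coverage>"""

print(diff_reports(RUN_A, RUN_B))
```

If the diff is empty but the published totals still differ, the merge input is probably not the cause. A file that appears in only one run would point at the merge step or at a test project whose coverage output was not picked up.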

When BQC reports zero data without waiting, it usually means that there is no coverage data. Usually, you publish coverage data, the job runs in the background, and this will lead to an API result that makes BQC wait. You should only see zero coverage if the API result states that there is no data and merge jobs have completed, or if BQC runs into a timeout after waiting for some time.

Maybe the agent was still uploading the coverage file when BQC was executing? I have seen rare instances of race conditions like this.
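For anyone who wants to experiment with guarding against such a race in their own script step, here is a rough sketch of a "wait until the numbers stop changing" loop. It is not BQC's internal logic. `fetch` is a hypothetical callable that you would implement yourself to return (covered, total) from whatever source you trust, and the defaults for interval and timeout are arbitrary.

```python
import time
from typing import Callable, Optional, Tuple

Counts = Tuple[int, int]  # (covered, total)


def wait_for_stable_coverage(
    fetch: Callable[[], Counts],
    required_matches: int = 3,
    interval_s: float = 10.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Counts]:
    """Poll `fetch` until it returns the same non-zero totals several times in a row.

    Returns the stable counts, or None if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout_s
    last: Optional[Counts] = None
    streak = 0
    while True:
        counts = fetch()
        if counts[1] > 0 and counts == last:
            streak += 1  # same non-zero reading as last time
        else:
            streak = 1 if counts[1] > 0 else 0  # new value, or still empty
        last = counts
        if streak >= required_matches:
            return counts
        if time.monotonic() >= deadline:
            return None
        sleep(interval_s)


# Simulated readings: empty, then a partial result, then the final value three times.
readings = iter([(0, 0), (7002, 13774), (26060, 36981), (26060, 36981), (26060, 36981)])
print(wait_for_stable_coverage(lambda: next(readings), sleep=lambda s: None))
```

Requiring several identical readings instead of trusting the first non-zero one is meant to cover the case where a result is reported as final and then changes again. It cannot help if the data never arrives at all.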

ReneSchumacher avatar May 23 '25 10:05 ReneSchumacher

Hello @ReneSchumacher,

I basically just replaced the PublishCodeCoverageResults task with the ReportGenerator task, and removed the "dotnet-coverage merge" CLI command step. The problem may well have been related to the dotnet-coverage tool. Nonetheless, using PublishCodeCoverageResults@2 (which is able to merge too) wasn't an option, as it doesn't include support for branch coverage.

As for the 0% code coverage detected by BQC, could that have been a race condition if both tasks are part of the same pipeline job? One doesn't start until the previous task has finished. Unless more processing is done on the DevOps API end in between...

ptemmer avatar May 23 '25 14:05 ptemmer

Hi @ReneSchumacher

I guess I cheered a bit too early. While different coverage numbers run-to-run are not an issue anymore, at random times I keep running into BQC obtaining zeros:

Task         : Build Quality Checks
Description  : Breaks a build based on quality metrics like number of warnings or code coverage.
Version      : 9.2.2
Author       : Microsoft
Help         : [[Docs]](https://github.com/MicrosoftPremier/VstsExtensions/blob/master/BuildQualityChecks/en-US/overview.md)
==============================================================================
Using IdentifierJobResolver
Validating code coverage policy...
Successfully read code coverage data from build.
Total lines: 0
Covered lines: 0
Code Coverage (%): 0
No baseline build for current branch found. Looking for builds on pull request target branch.
Found baseline build with ID 1032067.
Successfully read code coverage data from build.
Evaluating coverage data from 1 filtered code coverage data sets...
Total lines: 86217
Covered lines: 28852
Code Coverage (%): 33.4644

I noticed that once it fails for a pipeline, it always fails, no matter how many times I rerun the failed job. However, on the next pipeline run it "MAY" succeed.

I even tried introducing a 30s wait between the ReportGenerator upload and the BQC tasks, to no avail. FYI, I do see the correct figures in the ReportGenerator upload task.

Is there a way to check the Azure DevOps API progress on processing the report, to double check whether it's actually processing anything at all?

ptemmer avatar May 30 '25 11:05 ptemmer

We also see intermittent failures from run to run: the initial run shows 0, and after we requeue, the follow-up run shows the actual amount.

Before requeue: Image

After requeue (rerunning failed jobs doesn't work): Image

realgio95 avatar Sep 11 '25 18:09 realgio95