gcovr icon indicating copy to clipboard operation
gcovr copied to clipboard

C++ constructors and destructors in Cobertura output

Open Pesa opened this issue 1 year ago • 21 comments

Describe the bug

I'm currently experiencing an issue with the cobertura output where gcovr is generating two entries for the same destructor. The relevant portion of the generated report looks like this:

            <method name="ndn::Face::~Face" signature="()" line-rate="1.0" branch-rate="1.0" complexity="0.0">
              <lines/>
            </method>
            <method name="ndn::Face::~Face" signature="()" line-rate="1.0" branch-rate="0.5" complexity="0.0">
              <lines>
                <line number="148" hits="727" branch="true" condition-coverage="50% (2/4)">
                  <conditions>
                    <condition number="0" type="jump" coverage="50%"/>
                  </conditions>
                </line>
              </lines>
            </method>

The problem is that the cobertura parser in our Jenkins CI doesn't seem to like the duplicate method.

java.lang.IllegalArgumentException: There is already the same child [METHOD] ndn::Face::~Face() <0> with the name ndn::Face::~Face() in [CLASS] face_cpp <11, LINE: 75.34% (55/73)>

It turns out that these destructors are really two compiler-generated variants of the same source-level destructor. This is clear from looking at the mangled names in the json report:

                {
                    "name": "_ZN3ndn4FaceD0Ev",
                    "demangled_name": "ndn::Face::~Face()",
                    "lineno": 148,
                    "execution_count": 1,
                    "blocks_percent": 100.0,
                    "pos": [
                        "148:1",
                        "148:1"
                    ]
                },
                {
                    "name": "_ZN3ndn4FaceD2Ev",
                    "demangled_name": "ndn::Face::~Face()",
                    "lineno": 148,
                    "execution_count": 242,
                    "blocks_percent": 100.0,
                    "pos": [
                        "148:1",
                        "148:1"
                    ]
                },

To Reproduce

Steps to reproduce the behavior:

  1. ...
  2. ...

Expected behavior

I'm not actually sure what the behavior should be... Maybe the coverage data for the two variants of the destructor should be merged somehow? Or discard one of the two?

I also wonder if/how the HTML reporter handles this case, since it doesn't seem to be affected (I tried with --html-details and there was no error).

Desktop (please complete the following information)

  • OS: Ubuntu 24.10
  • GCC version 14.2.0
  • GCOVR version 8.3
  • Project directory layout:
    • Roots
    • Object directories

Additional context

We started seeing this error as soon as we upgraded our code coverage CI from GCC 11 + gcovr 5.2 to GCC 14 + gcovr 8.3.

Pesa avatar Mar 28 '25 21:03 Pesa

#1059 seems to be related. However, the description of that PR mentions different compiler versions while we are using the same exact compiler for everything.

Pesa avatar Mar 28 '25 21:03 Pesa

You're right, this part is commented in https://github.com/gcovr/gcovr/pull/1059#issuecomment-2631549943. For now this should be done in a separate PR. I'll check on it.

Spacetown avatar Mar 29 '25 18:03 Spacetown

Upon further investigation, the "duplicate method" issue affects more than just destructors. For instance, when a method has both const and non-const overloads (very common for begin()/end() in C++), the const gets lost somewhere and the final Cobertura XML contains two entries that look identical:

<method name="ndn::security::AdditionalDescription::begin[abi:cxx11]" signature="()" line-rate="1.0" branch-rate="1.0" complexity="0.0">
<method name="ndn::security::AdditionalDescription::begin[abi:cxx11]" signature="()" line-rate="1.0" branch-rate="1.0" complexity="0.0">

but in reality one is for the const overload and the other is for the non-const overload. The JSON output clearly shows the two methods with (obviously) different mangled names:

                {
                    "name": "_ZN3ndn8security21AdditionalDescription5beginB5cxx11Ev",
                    "demangled_name": "ndn::security::AdditionalDescription::begin[abi:cxx11]()",
                    "lineno": 59,
                    "execution_count": 1,
                    "blocks_percent": 100.0,
                    "pos": [
                        "59:1",
                        "62:1"
                    ]
                },
                {
                    "name": "_ZNK3ndn8security21AdditionalDescription5beginB5cxx11Ev",
                    "demangled_name": "ndn::security::AdditionalDescription::begin[abi:cxx11]() const",
                    "lineno": 71,
                    "execution_count": 1,
                    "blocks_percent": 100.0,
                    "pos": [
                        "71:1",
                        "74:1"
                    ]
                }

Conceptually, this is quite different from the case of the destructors, because in this case there are actually two completely distinct implementations (const vs non-const) in the source code, while for the destructor case there will always be (at most) one definition in the source code. Therefore in this case the coverage data should not be merged, but I still must be able to distinguish the two overloads in the Cobertura output. Please let me know if I should open a separate bug for this.

Pesa avatar Mar 29 '25 20:03 Pesa

I'm on it. Is here signature="() const" correct?

Spacetown avatar Mar 29 '25 20:03 Spacetown

@Pesa Please can you check the linked PR?

Spacetown avatar Mar 29 '25 21:03 Spacetown

There is a similar problem with a constructor now (this is with #1085)

_ZN3ndn13OBufferStreamC1Ev vs. _ZN3ndn13OBufferStreamC2Ev

(ERROR) GCOV produced the following errors processing /home/jenkins/workspace/ndn-cxx/OS/code-coverage/build/ndn-cxx/encoding/buffer-stream.cpp.2.gcda:
	Function mangled name must be equal, got _ZN3ndn13OBufferStreamC1Ev and _ZN3ndn13OBufferStreamC2Ev.
	GCOV data file of merge source is:
	   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/buffer-stream.cpp.2##17f0e76ec4fd57b60f0d98f002698143.gcov.json.gz
	and of merge target is:
	   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/buffer-stream.cpp.2##17f0e76ec4fd57b60f0d98f002698143.gcov.json.gz
	(gcovr could not infer a working directory that resolved it.)
To ignore this error use option --gcov-ignore-errors=no_working_dir_found.
(ERROR) Traceback (most recent call last):
  File "/home/jenkins/.cache/uv/archive-v0/0LOLB3V9p-nZukZ_5mJQH/lib/python3.12/site-packages/gcovr/formats/gcov/workers.py", line 95, in worker
    work(*args, **kwargs)
  File "/home/jenkins/.cache/uv/archive-v0/0LOLB3V9p-nZukZ_5mJQH/lib/python3.12/site-packages/gcovr/formats/gcov/read.py", line 498, in process_datafile
    raise RuntimeError(errors_output)
RuntimeError: GCOV produced the following errors processing /home/jenkins/workspace/ndn-cxx/OS/code-coverage/build/ndn-cxx/encoding/buffer-stream.cpp.2.gcda:
	Function mangled name must be equal, got _ZN3ndn13OBufferStreamC1Ev and _ZN3ndn13OBufferStreamC2Ev.
	GCOV data file of merge source is:
	   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/buffer-stream.cpp.2##17f0e76ec4fd57b60f0d98f002698143.gcov.json.gz
	and of merge target is:
	   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/buffer-stream.cpp.2##17f0e76ec4fd57b60f0d98f002698143.gcov.json.gz
	(gcovr could not infer a working directory that resolved it.)
To ignore this error use option --gcov-ignore-errors=no_working_dir_found.

Pesa avatar Mar 29 '25 22:03 Pesa

Btw what does "gcovr could not infer a working directory that resolved it" mean?

Pesa avatar Mar 29 '25 22:03 Pesa

Btw what does "gcovr could not infer a working directory that resolved it" mean?

The merge errors are catched and added to the default message. This doesn't make sense here.

Spacetown avatar Mar 30 '25 21:03 Spacetown

Tried with the latest version of #1085 (commit de2a24fad81c5dbf9824c1649f97c67f0ae56739) and gcovr finishes successfully now. Also, jenkins is happy with the generated cobertura XML. There is a slight discrepancy (~0.1%) in the number of covered lines between the native gcovr summary and the cobertura report, and a slightly larger (~1%) discrepancy for function coverage, not sure what's causing that. On the other hand the numbers for branch coverage are identical.

Pesa avatar Mar 30 '25 22:03 Pesa

Another error with constructors with commit d356ef987509c12124354ef276544635425d82f9:

(INFO) Reading coverage data...
(ERROR) Traceback (most recent call last):
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/workers.py", line 95, in worker
    work(*args, **kwargs)
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/read.py", line 466, in process_datafile
    done = run_gcov_and_process_files(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/read.py", line 828, in run_gcov_and_process_files
    process_gcov_json_data(gcov_filename, covdata, options)
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/read.py", line 180, in process_gcov_json_data
    coverage = json.parse_coverage(
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/parser/json.py", line 114, in parse_coverage
    filecov = _parse_file_node(
              ^^^^^^^^^^^^^^^^^
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/formats/gcov/parser/json.py", line 227, in _parse_file_node
    filecov.insert_function_coverage(
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/data_model/coverage.py", line 1617, in insert_function_coverage
    self.functions[key].merge(functioncov, options)
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/data_model/coverage.py", line 990, in merge
    self.mangled_name = self._merge_property(
                        ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/data_model/coverage.py", line 180, in _merge_property
    self.raise_merge_error(
  File "/home/jenkins/.cache/uv/archive-v0/tW48XpwEr9Rcgtyf6pT74/lib/python3.12/site-packages/gcovr/data_model/coverage.py", line 142, in raise_merge_error
    raise GcovrMergeAssertionError(
gcovr.data_model.coverage.GcovrMergeAssertionError: Function mangled name must be equal, got _ZN3ndn4util14IndentedStreamC1ERSoSt17basic_string_viewIcSt11char_traitsIcEE and _ZN3ndn4util14IndentedStreamC2ERSoSt17basic_string_viewIcSt11char_traitsIcEE.
GCOV data file of merge source is:
   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/indented-stream.cpp.2##0ecd6002d2762306549a1ce06cb21a16.gcov.json.gz
and of merge target is:
   /home/jenkins/workspace/ndn-cxx/OS/code-coverage/indented-stream.cpp.2##0ecd6002d2762306549a1ce06cb21a16.gcov.json.gz

Pesa avatar Apr 02 '25 19:04 Pesa

I don't know why this wasn't failing in my previous test with commit de2a24fad81c5dbf9824c1649f97c67f0ae56739... The C++ code has not changed.

Pesa avatar Apr 02 '25 19:04 Pesa

I wonder what the gcovr implementation is doing here? Both mangled names are demangled to the same function name by c++filt (as expected), why isn't gcovr merging them based on their demangled name?

Pesa avatar Apr 02 '25 19:04 Pesa

Both seems to be constructors but we only normalise at the end: _ZN3ndn4util14IndentedStream C1 ERSoSt17basic_string_viewIcSt11char_traitsIcEE _ZN3ndn4util14IndentedStream C2 ERSoSt17basic_string_viewIcSt11char_traitsIcEE

Maybe we should use the mangled name from GCOV and always detangle the name ourself by using e.g. https://pypi.org/project/itanium-demangler

Spacetown avatar Apr 02 '25 19:04 Spacetown

The gcov json file (indented-stream.cpp.2##0ecd6002d2762306549a1ce06cb21a16.gcov.json.gz) already has the demangled_name, no? Why can't you use that? Are there cases where you do not want to merge based on the demangled name?

      "file": "../ndn-cxx/util/indented-stream.cpp",
      "functions": [
        {
          "name": "_ZN3ndn4util14IndentedStreamC1ERSoSt17basic_string_viewIcSt11char_traitsIcEE",
          "demangled_name": "ndn::util::IndentedStream::IndentedStream(std::ostream&, std::basic_string_view<char, std::char_traits<char> >)",
          "start_line": 32,
          "start_column": 1,
          "end_line": 36,
          "end_column": 1,
          "blocks": 7,
          "blocks_executed": 5,
          "execution_count": 10
        },
        {
          "name": "_ZN3ndn4util14IndentedStreamC2ERSoSt17basic_string_viewIcSt11char_traitsIcEE",
          "demangled_name": "ndn::util::IndentedStream::IndentedStream(std::ostream&, std::basic_string_view<char, std::char_traits<char> >)",
          "start_line": 32,
          "start_column": 1,
          "end_line": 36,
          "end_column": 1,
          "blocks": 4,
          "blocks_executed": 0,
          "execution_count": 0
        },

Pesa avatar Apr 02 '25 19:04 Pesa

That was the next idea. If we have a demangled name use the first mangled name when merged. If we do not have a demangled name we check that the mangled name is equal on merge.

To have a reliable result in multithreaded execution we always need to merge to the same mangled name if we have a demangled name defined.

Spacetown avatar Apr 02 '25 20:04 Spacetown

I checked the output of the Cobertura reporter and AFAICS constructors/destructors are properly merged now, and Jenkins doesn't complain about duplicate methods anymore. However it seems that they're being merged in the JSON report as well, is that intended?

Pesa avatar Apr 03 '25 20:04 Pesa

We're not using the JSON report so I'm totally fine either way (merging or not). I just wanted to double check that this was an intended behavior change.

Pesa avatar Apr 03 '25 20:04 Pesa

That was intended to merge them already in the data model but now that you're asking the lines are still present for all mangled names and now the function definition for _ZN3ndn4util14IndentedStreamC2ERSoSt17basic_string_viewIcSt11char_traitsIcEE is missing. :confused:

Spacetown avatar Apr 03 '25 20:04 Spacetown

@Pesa Do we need to merge the functions or is it also ok to add the mangled name to the signature to have unique entries.

Spacetown avatar Apr 18 '25 19:04 Spacetown

Are you asking about the json report? I don't have a strong preference. I suppose adding the mangled name and keeping them separate is the more "lossless" approach, and also more similar to gcov's behavior. But either way works for me.

Pesa avatar Apr 19 '25 05:04 Pesa

In data model and JSON I would revert and keep it separate. In Cobertura and HTML I'm working on merging the data but it's quite complicated to create a merged copy without altering the original data.

Spacetown avatar Apr 19 '25 10:04 Spacetown

Since there were no updates in 3 months: what's the fix in the meantime? Would downgrading to some non-latest version help?

mikedld avatar Jul 22 '25 11:07 mikedld

@mikedld I'm still working on it. If you've problems this problem you can downgrade to 7.2. Is it possible for you to test the PR branch after I push my current updates?

Spacetown avatar Jul 22 '25 20:07 Spacetown

Sure, ready to test, time permitting. FYI with GCC 15.1, while the latest release (8.3) fails when Jenkins tries to import the report:

java.lang.IllegalArgumentException: There is already the same child [METHOD] IFoo::~IFoo() <0> with the name IFoo::~IFoo() in [CLASS] IFoo_h <1, LINE: 100.00% (1/1)>

current main (gcovr-8.3.post1.dev33+ge92c4b03) fails before that on processing:

(INFO) Reading coverage data...
(ERROR) Traceback (most recent call last):
  File "…/gcovr/formats/gcov/workers.py", line 95, in worker
    work(*args, **kwargs)
    ~~~~^^^^^^^^^^^^^^^^^
  File "…/gcovr/formats/gcov/read.py", line 472, in process_datafile
    done = run_gcov_and_process_files(
        abs_filename,
    ...<3 lines>...
        chdir=wd,
    )
  File "…/gcovr/formats/gcov/read.py", line 843, in run_gcov_and_process_files
    process_gcov_json_data(gcov_filename, covdata, options)
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "…/gcovr/formats/gcov/read.py", line 210, in process_gcov_json_data
    covdata.insert_file_coverage(filecov, merge_options)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/container.py", line 198, in insert_file_coverage
    self.data[key].merge(filecov, options)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/coverage.py", line 1689, in merge
    self.lines.merge(other.lines, options)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/coverage_dict.py", line 58, in merge
    self[key].merge(item, options)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/coverage.py", line 1118, in merge
    self.calls.merge(other.calls, options)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/coverage_dict.py", line 58, in merge
    self[key].merge(item, options)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "…/gcovr/data_model/coverage.py", line 897, in merge
    self.source_block_id = self._merge_property(
                           ~~~~~~~~~~~~~~~~~~~~^
        other, "Source block ID", lambda x: x.source_block_id
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "…/gcovr/data_model/coverage.py", line 179, in _merge_property
    self.raise_merge_error(
    ~~~~~~~~~~~~~~~~~~~~~~^
        f"{msg} must be equal, got {left} and {right}.",
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        other,
        ^^^^^^
    )
    ^
  File "…/gcovr/data_model/coverage.py", line 140, in raise_merge_error
    raise GcovrMergeAssertionError(
    ...<11 lines>...
    )
gcovr.data_model.coverage.GcovrMergeAssertionError: …/Foo.h:127 (call 11) Source block ID must be equal, got 14 and 20.
GCOV data file of merge source is:
   …@tmp/_build/…/Bar.cpp##4da5a4bb1febbbf80c5406e899332d28.gcov.json.gz
and of merge target is:
   …@tmp/_build/…/FooTest.cpp##3e8f2d26c3fb46cc4ebbb110765b5054.gcov.json.gz

(ERROR) Error occurred while reading reports:
Traceback (most recent call last):
  File "…/gcovr/__main__.py", line 384, in main
    covdata = gcovr_formats.read_reports(options)
  File "…/gcovr/formats/__init__.py", line 98, in read_reports
    covdata = GcovHandler(options).read_report()
  File "…/gcovr/formats/gcov/__init__.py", line 231, in read_report
    return read_report(self.options)
  File "…/gcovr/formats/gcov/read.py", line 93, in read_report
    contexts = pool.wait()
  File "…/gcovr/formats/gcov/workers.py", line 181, in wait
    raise RuntimeError(
        "Worker thread raised exception, workers canceled."
    ) from None
RuntimeError: Worker thread raised exception, workers canceled.

Downgrading to 7.2 does indeed work.

mikedld avatar Jul 22 '25 22:07 mikedld

@mikedld Please can you try pip install git+https://github.com/Spacetown/gcovr.git@merge_different_branches_and_conditions?

Spacetown avatar Jul 23 '25 15:07 Spacetown

Since there were no updates in 3 months: what's the fix in the meantime? Would downgrading to some non-latest version help?

We've pinned our CI to commit 99b82e7 which fixes the problem with constructors, destructors, and const overloads for us.

Pesa avatar Jul 23 '25 18:07 Pesa

@Pesa Would it be possible for you to try the branch and give a feedback?

Spacetown avatar Jul 23 '25 20:07 Spacetown

@mikedld Please can you try pip install git+https://github.com/Spacetown/gcovr.git@merge_different_branches_and_conditions?

It succeeds, which is great. The statistics reported are quite different from the previous version though:

gcc 13, gcovr 7.2 gcc 15, gcovr 7.2 gcc 15, gcovr git
module coverage 100,00% 100,00% 100,00%
package coverage 87,88% 87,88% 87,88%
file coverage 67,72% 67,29% 52,95%
class coverage 67,72% 67,29% 52,95%
method coverage - - 60,59%
line coverage 37,48% 37,26% 44,22%
branch coverage 28,97% 29,10% 31,74%
lines of code 16990 17058 22140
number of files 378 (total)
237 (.cpp)
141 (.h)
373 (total)
237 (.cpp)
136 (.h)
474 (total)
237 (.cpp)
237 (.h)

Unfortunately I don't have the raw report files to compare directly, LMK if you want me to look into that. What I notice is that despite the apparent coverage increase, with git version Jenkins [mistakenly] shows a lot of places as not covered while with 7.2 they're [correctly] shown as covered. There're also cases where Jenkins would show exactly the same set of (un)covered lines for both git version and 7.2, e.g. a small header with a class and just a single covered line (defaulted dtor), yet would show different stats, e.g. 50% (git) vs. 100% (7.2).

mikedld avatar Jul 23 '25 23:07 mikedld

Are you using the Cobertura report for Uploading? Can you provide the files or compare on your own why there is such a huge difference in the number of header files? Can you provide an example where the lines are showed as uncovered?

Spacetown avatar Jul 24 '25 16:07 Spacetown

Attaching two anonymized cobertura reports: "old" (7.2) and "new" (git). The only difference is the gcovr version the rest is exactly the same (gcc 15).

anonymize.py
#!/usr/bin/env python3

import sys
from hashlib import sha256
from pathlib import Path

from lxml import etree


def anonymize(text):
    return sha256(text.encode("utf-8")).hexdigest()[:8]


def main(path):
    root = etree.fromstring(path.read_bytes())

    for action, elem in etree.iterwalk(root):
        if elem.tag == "source":
            elem.text = "/".join([
                anonymize(x) if x not in ("", "..") else x
                for x in elem.text.split("/")
            ])
        elif elem.tag == "package":
            elem.attrib["name"] = ".".join([
                "p" + anonymize(x)
                for x in elem.attrib["name"].split(".")
            ])
        elif elem.tag == "class":
            name = elem.attrib["name"].rsplit("_", 1)
            name[0] = "c" + anonymize(name[0])
            elem.attrib["name"] = "_".join(name)
            fname = elem.attrib["filename"].rsplit(".", 1)
            fname[0] = "/".join([anonymize(x) for x in fname[0].split("/")])
            elem.attrib["filename"] = ".".join(fname)
        elif elem.tag == "method":
            elem.attrib["name"] = "m" + anonymize(elem.attrib["name"])
            if elem.attrib["signature"] != "()":
                elem.attrib["signature"] = "(" + anonymize(elem.attrib["signature"]) + ")"

    path.with_suffix(".out.xml").write_bytes(etree.tostring(root))

if __name__ == "__main__":
    main(Path(sys.argv[1]))

mikedld avatar Jul 25 '25 15:07 mikedld