Previous merge behaviour did not fully merge coverage files when multiple covergroups existed under a single <covergroupCoverage>.

Consider the following XML file generated when collecting coverage

cov.xml

<!-- cov.xml -->
<UCIS xmlns:ucis="http://www.w3.org/2001/XMLSchema-instance" writtenBy="kasper" writtenTime="2023-09-19T00:00:00" ucisVersion="1.0">
  <sourceFiles fileName="__null__file__" id="1"/>
  <sourceFiles fileName="&lt;unknown&gt;" id="2"/>
  <sourceFiles fileName="/home/kasper/cocotb/gen_coverage.py" id="3"/>
  <historyNodes historyNodeId="0" logicalName="logicalName" physicalName="foo.ucis" kind="HistoryNodeKind.TEST" testStatus="true" simtime="0.0" timeunit="ns" runCwd="." cpuTime="0.0" seed="0" cmd="" args="" compulsory="0" date="2023-09-19T16:42:34" userName="user" cost="0.0" toolCategory="UCIS:simulator" ucisVersion="1.0" vendorId="unknown" vendorTool="unknown" vendorToolVersion="unknown"/>
  <instanceCoverages name="cg_inst" key="0">
    <id file="1" line="1" inlineCount="1"/>
    <covergroupCoverage>
      <cgInstance name="uvm_test_top.env.master.coverage.cg_addr" key="0">
        <options/>
        <cgId cgName="uvm_test_top.env.master.coverage.cg_addr" moduleName="uvm_test_top.env.master.coverage.cg_addr">
          <cginstSourceId file="1" line="1" inlineCount="1"/>
          <cgSourceId file="1" line="1" inlineCount="1"/>
        </cgId>
        <coverpoint name="addr" key="0">
          <options/>
          <coverpointBin name="low" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="36"/>
            </range>
          </coverpointBin>
          <coverpointBin name="med" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="30"/>
            </range>
          </coverpointBin>
          <coverpointBin name="high" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="34"/>
            </range>
          </coverpointBin>
        </coverpoint>
      </cgInstance>
      <cgInstance name="uvm_test_top.env.sdt_slave.coverage.cg_addr" key="0">
        <options/>
        <cgId cgName="uvm_test_top.env.master.coverage.cg_addr" moduleName="uvm_test_top.env.master.coverage.cg_addr">
          <cginstSourceId file="1" line="1" inlineCount="1"/>
          <cgSourceId file="1" line="1" inlineCount="1"/>
        </cgId>
        <coverpoint name="addr" key="0">
          <options/>
          <coverpointBin name="low" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="36"/>
            </range>
          </coverpointBin>
          <coverpointBin name="med" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="30"/>
            </range>
          </coverpointBin>
          <coverpointBin name="high" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="34"/>
            </range>
          </coverpointBin>
        </coverpoint>
      </cgInstance>
    </covergroupCoverage>
    <covergroupCoverage>
      <cgInstance name="uvm_test_top.env.master.coverage.cg_delays" key="0">
        <options/>
        <cgId cgName="uvm_test_top.env.master.coverage.cg_delays" moduleName="uvm_test_top.env.master.coverage.cg_delays">
          <cginstSourceId file="1" line="1" inlineCount="1"/>
          <cgSourceId file="1" line="1" inlineCount="1"/>
        </cgId>
        <coverpoint name="delay_cycles" key="0">
          <options/>
          <coverpointBin name="0" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="0"/>
            </range>
          </coverpointBin>
          <coverpointBin name="[1,5]" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="77"/>
            </range>
          </coverpointBin>
          <coverpointBin name="[6,15]" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="23"/>
            </range>
          </coverpointBin>
          <coverpointBin name="&gt;15" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="0"/>
            </range>
          </coverpointBin>
        </coverpoint>
      </cgInstance>
      <cgInstance name="uvm_test_top.env.sdt_slave.coverage.cg_delays" key="0">
        <options/>
        <cgId cgName="uvm_test_top.env.master.coverage.cg_delays" moduleName="uvm_test_top.env.master.coverage.cg_delays">
          <cginstSourceId file="1" line="1" inlineCount="1"/>
          <cgSourceId file="1" line="1" inlineCount="1"/>
        </cgId>
        <coverpoint name="delay_cycles" key="0">
          <options/>
          <coverpointBin name="0" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="0"/>
            </range>
          </coverpointBin>
          <coverpointBin name="[1,5]" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="77"/>
            </range>
          </coverpointBin>
          <coverpointBin name="[6,15]" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="23"/>
            </range>
          </coverpointBin>
          <coverpointBin name="&gt;15" type="bins" key="0">
            <range from="-1" to="-1">
              <contents coverageCount="0"/>
            </range>
          </coverpointBin>
        </coverpoint>
      </cgInstance>
    </covergroupCoverage>
  </instanceCoverages>
</UCIS>

The XML file contains 2x <covergroupCoverage>, each of which contains two <cgInstance>. When viewing with pyucis_viewer, the covergroupCoverage sections are labeled "default" since multiple covergroups exist beneath them.

When merging this coverage file with another coverage file from a similar run, the coverage merger only identifies one covergroup name (default), and thus doesn't include all nested covergroups. In the following, see that only the last covergroup from each section is included (cg_delays)

This PR fixes this by including all covergroups under a heading, not just the final one discovered (the one coming last in an alphabetical sorting). In the end, all covergroups are placed under a single "default" heading.

An alternative to this PR would be to modify the underlying database structure such that each covergroupCoverage is only associated with one cgInstance, but this might break spec conformance. Alternatively, a userAttr could be added to each covergroupCoverage to name the covergroupCoverage since adding a name entry directly in the XML field does not conform to spec.

Sep 19 '23 15:09 KasperHesse

Hi @KasperHesse, thanks for reaching out with this report. I agree that the results you see aren't expected. The solution might require more discussion.

I've been looking back over the UCIS spec, and looking at the UCIS XML reader code. My understanding is:

The 'name' attribute on on the cgInstance entity captures the instance name of the covergroup. In your examples, the instance names of the first covergroup type are uvm_test_top.env.master.coverage.cg_addr and uvm_test_top.env.sdt_slave.coverage.cg_addr
The 'cgName' attribute of the cgId sub-entity captures the type name of the covergroup. In your example, the first covergroup type is uvm_test_top.env.master.coverage.cg_addr. Both cgInstance entities have this same type name.

I would expect the UCIS Viewer to display the type name (eg uvm_test_top.env.master.coverage.cg_addr) instead of 'default'. Generally speaking, 'default' is shown when some optional name/identifier isn't present in the XML.

Based on looking at your example, I would expect to see: TYPE: uvm_test_top.env.master.coverage.cg_addr

INST: uvm_test_top.env.master.coverage.cg_addr
INST: uvm_test_top.env.sdt_slave.coverage.cg_addr TYPE: uvm_test_top.env.master.coverage.cg_delays
INST: uvm_test_top.env.master.coverage.cg_delays
INST: uvm_test_top.env.sdt_slave.coverage.cg_delays

Does this look as you expect?

~Matthew

Sep 20 '23 00:09 mballance

Hi Matthew, I think your proposal is also sound. However, your proposed example mixes coverage from the sdt_slave and the master components under the same heading, which I think could be confusing. In that case, it would be better if we could just extract the name of the covergroup, such that we get a structure as follows:

TYPE: cg_addr (or cov_collector.cg_addr)
- INST: uvm_test_top.env.master.coverage.cg_addr
- INST: uvm_test_top.env.sdt_slave.coverage.cg_addr
TYPE: cg_delays (or cov_collector.cg_delays)
- INST: uvm_test_top.env.master.coverage.cg_delays
- INST: uvm_test_top.env.sdt_slave.coverage.cg_delays

Obviously this would require that we don't have duplicate covergroup names anywhere in the testbench. If that were the case, it would also be necessary to identify the name of the class in which the covergroup is instantiated, as shown in brackets.

An easy fix would be to replace the following code in xml_reader.py https://github.com/fvutils/pyucis/blob/2bc9afd4ce2755cde28d0246997c1accecba0769/src/ucis/xml/xml_reader.py#L152-L160 with

cg_typescope = inst_scope.createCovergroup(
  self.getAttr(instances[0], "name", "default").split(".")[-1],
  None,
  1,
  UCIS_OTHER
  )

To get the name of the covergroup. See below. This correctly merges covergroups with the current merging-code, but could inadvertently merge duplicate covergroup names from unrelated coverage collectors. Thoughts?

Sep 20 '23 12:09 KasperHesse

Hi Kasper, Ok, I'm starting to get the picture. Thanks for the additional explanation. Now, to get this result, we would need to treat elements of the UCIS schema differently from how I understand their intended use. Specifically, my understanding is that cgInstance::name is the instance name of the covergroup, while cgInstance::cgId::name is the type name of the covergroup. From the explanation above, it seems you're assuming the reverse (ie the leaf name of cgInstance::name is the type name, while cgInstance::cgId::name is the instance name). How was the .xml file produced? Was it produced by running PyVSC, a commercial tool, another open source tool, by hand, etc?

Thanks, Matthew

Sep 20 '23 14:09 mballance

Hi Matthew, The XML file was produced using pyVSC. I manually cleaned it up a bit by removing some of our additional coverpoints for clarity, but I don't believe I've made any changes to the structure of the file. I'll go over it again at a later point, just to make sure I haven't inadvertently messed something up.

Sep 21 '23 07:09 KasperHesse

Hi Kasper, That's good news, since any issues (eg instance vs type names) are self-contained. If you're able to share the covergroup structure that created this XML, please do. I'll do a little investigation as well.

Sep 22 '23 14:09 mballance

Hi Matthew, Sorry about the delay in getting back to you. I've put together a somewhat-minimal MWE so we can get better comparisons. Below are outlined two experiments I did and my findings.

Experiment 1

Here, I'm using the following Python code to generate a coverage report. I'm defining two covergroups, and instantiating two of each. I generate some dummy data that is sampled before generating the XML coverage report.

Python code to generate coverage

import vsc
import random

@vsc.covergroup
class first_covergroup(object):
    def __init__(self, name):
        self.options.name = name
        # self.options.per_instance = True

        self.with_sample(
            rdwr = vsc.bit_t(1),
            addr = vsc.bit_t(8),
            data = vsc.bit_t(8)
        )

        self.cp_rdwr = vsc.coverpoint(self.rdwr, bins={
            "RD" :  vsc.bin(0),
            "WR" :  vsc.bin(1)
        })

        self.cp_addr = vsc.coverpoint(self.addr, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_data = vsc.coverpoint(self.data, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_cross = vsc.cross(
            [self.cp_rdwr, self.cp_addr, self.cp_data]
        )

@vsc.covergroup
class second_covergroup(object):
    def __init__(self, name):
        self.options.name = name
        # self.options.per_instance = True

        self.with_sample(
            addr = vsc.bit_t(8),
            data = vsc.bit_t(8)
        )

        self.cp_addr = vsc.coverpoint(self.addr, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_data = vsc.coverpoint(self.data, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

if __name__ == "__main__":
   # Create two of each covergroup, generate 10 random samples for each covergroup
   fcg_1 = first_covergroup("first_covergroup_1")
   fcg_2 = first_covergroup("first_covergroup_2")
   scg_1 = second_covergroup("second_covergroup_1")
   scg_2 = second_covergroup("second_covergroup_2")

   fcg_1_rdwr = [random.randint(0,1) for _ in range(10)]
   fcg_1_addr = [random.randint(0,255) for _ in range(10)]
   fcg_1_data = [random.randint(0,255) for _ in range(10)]

   fcg_2_rdwr = [random.randint(0,1) for _ in range(10)]
   fcg_2_addr = [random.randint(0,255) for _ in range(10)]
   fcg_2_data = [random.randint(0,255) for _ in range(10)]

   scg_1_addr = [random.randint(0,255) for _ in range(10)]
   scg_1_data = [random.randint(0,255) for _ in range(10)]

   scg_2_addr = [random.randint(0,255) for _ in range(10)]
   scg_2_data = [random.randint(0,255) for _ in range(10)]

   for i in range(10):
    fcg_1.sample(fcg_1_rdwr[i], fcg_1_addr[i], fcg_1_data[i])
    fcg_2.sample(fcg_2_rdwr[i], fcg_2_addr[i], fcg_2_data[i])
    scg_1.sample(scg_1_addr[i], scg_1_data[i])
    scg_2.sample(scg_2_addr[i], scg_2_data[i])

   vsc.write_coverage_db(f"cov.xml")

The coverage is attached as cov.txt. (GitHub doesn't allow me to upload XML files, so the file extension had to be changed ...) This generates the following, when opened in PyUCIS Viewer coverage_sampled

I've used PyUCIS to merge the coverage DB with itself (attached as cov_merged_orig.txt). Merging the coverage DB using the original code generates the following: coverage_merged_orig

Merging the coverage report using my "fix" generates the following (attached as cov_merged_mod.txt) coverage_merged_modified

cov.txt cov_merged_mod.txt cov_merged_orig.txt

As you can see, with my modification, the coverage is correctly merged. However, the TYPE label is still "default", likely because a hierarchical naming scheme isn't used. Since we're using PyUVM and hierarchical naming in our project, that also warrants attention, leading us to:

Experiment 2

Now, a hierarchical system of coverage collectors is used. The code is as follows

Python code for experiment 2

The definitions of the covergroups haven't been changed.

import vsc
import random

@vsc.covergroup
class first_covergroup(object):
    def __init__(self, name):
        self.options.name = name
        # self.options.per_instance = True

        self.with_sample(
            rdwr = vsc.bit_t(1),
            addr = vsc.bit_t(8),
            data = vsc.bit_t(8)
        )

        self.cp_rdwr = vsc.coverpoint(self.rdwr, bins={
            "RD" :  vsc.bin(0),
            "WR" :  vsc.bin(1)
        })

        self.cp_addr = vsc.coverpoint(self.addr, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_data = vsc.coverpoint(self.data, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_cross = vsc.cross(
            [self.cp_rdwr, self.cp_addr, self.cp_data]
        )

@vsc.covergroup
class second_covergroup(object):
    def __init__(self, name):
        self.options.name = name
        # self.options.per_instance = True

        self.with_sample(
            addr = vsc.bit_t(8),
            data = vsc.bit_t(8)
        )

        self.cp_addr = vsc.coverpoint(self.addr, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })

        self.cp_data = vsc.coverpoint(self.data, bins={
            "low"  :  vsc.bin([0, 84]),
            "med"  :  vsc.bin([85, 169]),
            "high" :  vsc.bin([170, 255])
        })


class cov_collector():
  def __init__(self, name):
    self.name = name
    self.first = first_covergroup(f"{name}.first_covergroup")
    self.second = second_covergroup(f"{name}.second_covergroup")

  def do_cov(self):
    first_rdwr =  [random.randint(0,1) for _ in range(10)]
    first_addr =  [random.randint(0,255) for _ in range(10)]
    first_data =  [random.randint(0,255) for _ in range(10)]
    second_addr = [random.randint(0,255) for _ in range(10)]
    second_data = [random.randint(0,255) for _ in range(10)]

    for i in range(0,10):
      self.first.sample(first_rdwr[i], first_addr[i], first_data[i])
      self.second.sample(second_addr[i], second_data[i])


if __name__ == "__main__":
  cov_mst = cov_collector("mst")
  cov_slv = cov_collector("slv")
  cov_mst.do_cov()
  cov_slv.do_cov()
  vsc.write_coverage_db(f"cov_hier.xml")

This leads to the following coverage DB files and the following visualisations in PyUCIS Viewer. Again, the merged coverage DB's have been generated by merging the generated coverage with itself.

Generated coverage cov_hier.txt

Merged coverage, original code cov_hier_merged_orig.txt

Merged coverage, my modification cov_hier_merged_mod.txt

Takeaway

I believe my proposed modification to xml_reader.py could serve to fix some of the problems when merging coverage databases, as it doesn't drop everything but the last covergroup. It does, however, depend on the user using "." as a hierarchical separator to correctly set the TYPE label. I believe this is a valid assumption, but I realize it's not perfect.

If "." is not used as the separator, the TYPE labels will be somewhat misleading, but the merge will still happen correctly (pictured below with "," as separator. While the TYPE label is "mst,first_covergroup", the correctly merged and placed instances of "first_covergroup" under the same TYPE label) cov_comma

Sep 26 '23 07:09 KasperHesse

Fix incomplete merge with multiple covergroups in a covergroupCoverage

Experiment 1

Experiment 2

Takeaway