Automatically compute path remapping rules in "coverage combine"

Open zackw opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.

The [paths] section of a .coveragerc is a fragile beast. By way of example, if I'm testing my Python package in GitHub Actions CI on macOS, Ubuntu, and Windows using both pip and conda, and I want to merge coverage reports from all of them, then I need to have a [paths] section like this:

[paths]
merge =
    <PKG>/
    /Users/runner/miniconda3/envs/test/lib/python3.<V>/site-packages/<PKG>/
    /Users/runner/work/<PKG>/<PKG>/venv/lib/python3.<V>/site-packages/<PKG>/
    /home/runner/work/<PKG>/<PKG>/venv/lib/python3.<V>/site-packages/<PKG>/
    /usr/share/miniconda/envs/test/lib/python3.<V>/site-packages/<PKG>/
    C:\Miniconda\envs\test\Lib\site-packages\<PKG>\
    D:\a\<PKG>\<PKG>\venv\Lib\site-packages\<PKG>\

where <V> is the Python minor version (each path with <V> in it has to be duplicated for each version of Python I'm testing with) and <PKG> is the name of my package (there has to be a whole 'nother merge group for each package simultaneously under test).
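
To make the duplication concrete, here is a small sketch that expands the stanza above for one package. The templates are copied from the example; the package name `mypkg` and the list of minor versions are placeholders chosen for illustration, not values from any real CI setup.

```python
# Templates copied from the stanza above; {v} and {pkg} stand in for <V> and <PKG>.
VERSIONED = [
    "/Users/runner/miniconda3/envs/test/lib/python3.{v}/site-packages/{pkg}/",
    "/Users/runner/work/{pkg}/{pkg}/venv/lib/python3.{v}/site-packages/{pkg}/",
    "/home/runner/work/{pkg}/{pkg}/venv/lib/python3.{v}/site-packages/{pkg}/",
    "/usr/share/miniconda/envs/test/lib/python3.{v}/site-packages/{pkg}/",
]
UNVERSIONED = [
    "C:\\Miniconda\\envs\\test\\Lib\\site-packages\\{pkg}\\",
    "D:\\a\\{pkg}\\{pkg}\\venv\\Lib\\site-packages\\{pkg}\\",
]


def expand_merge_group(pkg, minor_versions):
    """Return every line of one merge group, with the canonical path first."""
    lines = [f"{pkg}/"]
    for template in VERSIONED:
        # Each versioned template is repeated once per Python minor version.
        lines.extend(template.format(v=v, pkg=pkg) for v in minor_versions)
    lines.extend(template.format(pkg=pkg) for template in UNVERSIONED)
    return lines


# Hypothetical package name and version list, for illustration only.
entries = expand_merge_group("mypkg", [9, 10, 11, 12])
print(len(entries), "lines for a single package")
```

Every additional package under test needs another group of the same size.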

All these absolute paths are implementation details of the CI runner that could change without any warning. I shouldn't have to know them at all, let alone write them into a file that's checked into version control.

Describe the solution you'd like

Instead of having the user determine and then write down a [paths] stanza, coverage combine should be able to compute all the necessary path remapping rules itself, for example like this:

def remap_paths_for_databases(cfg, databases):
    from coverage import CoverageData
    from collections import defaultdict
    from os.path import commonprefix
    from pathlib import PurePosixPath, PureWindowsPath

    prefixes = set()
    for db_fname in databases:
        db = CoverageData(basename=db_fname)
        db.read()
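        # commonprefix compares character by character, so trim the result
        # back to the last path separator to get a whole directory.
        prefix = commonprefix(list(db.measured_files()))
        cut = max(prefix.rfind('/'), prefix.rfind('\\'))
        prefixes.add(prefix[:cut + 1])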

    packages = defaultdict(set)
    for p in prefixes:
        if '\\' in p or (len(p) >= 2 and p[0].isalpha() and p[1] == ':'):
            name = PureWindowsPath(p).name
        else:
            name = PurePosixPath(p).name
        packages[name].add(p)

    pkg_names = sorted(packages.keys())

    cfg["run"]["relative_files"] = "true"
    cfg["run"]["source_pkgs"] = " ".join(pkg_names)

    cfg["paths"] = {}
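    for pkg in pkg_names:
        # The leading '' puts the first path on its own continuation line.
        pkg_paths = ['', pkg + '/']
        pkg_paths.extend(sorted(packages[pkg]))
        cfg["paths"][pkg] = "\n".join(pkg_paths)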

The cfg argument to this function is the ConfigParser instance for the .coveragerc, and the databases argument is a list of all the coverage databases to be combined.
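
The grouping step of that function can be tried on its own, without any coverage databases. The sketch below feeds it two sample prefixes (made up for illustration, modelled on the CI paths earlier in this issue) and stores the result in a ConfigParser. The helper names `looks_like_windows`, `group_prefixes`, and `paths_section` are hypothetical, and joining the list with newlines is an assumption about how the multi-line value would be written.

```python
from collections import defaultdict
from configparser import ConfigParser
from pathlib import PurePosixPath, PureWindowsPath


def looks_like_windows(path):
    """Same heuristic as in the proposal: a backslash or a drive letter."""
    return "\\" in path or (len(path) >= 2 and path[0].isalpha() and path[1] == ":")


def group_prefixes(prefixes):
    """Group directory prefixes by their final component (the package name)."""
    packages = defaultdict(set)
    for p in prefixes:
        flavour = PureWindowsPath if looks_like_windows(p) else PurePosixPath
        packages[flavour(p).name].add(p)
    return {name: sorted(paths) for name, paths in sorted(packages.items())}


def paths_section(groups):
    """Build one multi-line value per package, canonical relative path first."""
    # The leading "" makes the value start with a newline, as in a hand-written stanza.
    return {name: "\n".join(["", name + "/"] + paths) for name, paths in groups.items()}


# Sample prefixes, invented for illustration.
sample = [
    "/home/runner/work/mypkg/mypkg/venv/lib/python3.12/site-packages/mypkg/",
    "D:\\a\\mypkg\\mypkg\\venv\\Lib\\site-packages\\mypkg\\",
]
groups = group_prefixes(sample)

cfg = ConfigParser()
cfg["paths"] = paths_section(groups)
print(cfg["paths"]["mypkg"])
```

Both prefixes land in the same group because each path flavour reports `mypkg` as its final component, whatever the separator.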

Describe alternatives you've considered

One alternative would be to make relative_files mode more aggressive; it could trim out all path components above an element of sources / source_pkgs, and canonicalize path separators to /, before writing the database in the first place. That would make the subsequent combine step not need to remap paths.
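
A rough sketch of what that trimming could look like follows. It is not how coverage.py works today; `relativize` is a hypothetical helper, and it assumes that the last directory component matching a source package name is the package root (which would be wrong for a package that contains a subpackage of the same name).

```python
def relativize(filename, source_pkgs):
    """Drop everything above the package directory and use '/' separators."""
    normalized = filename.replace("\\", "/")
    parts = [part for part in normalized.split("/") if part]
    # Search directory components from the right, skipping the file name itself,
    # so a checkout directory that shares the package's name is ignored.
    for i in range(len(parts) - 2, -1, -1):
        if parts[i] in source_pkgs:
            return "/".join(parts[i:])
    # No source package found: only canonicalize the separators.
    return normalized


# Sample file names, invented for illustration.
posix_name = "/home/runner/work/mypkg/mypkg/venv/lib/python3.12/site-packages/mypkg/core.py"
windows_name = "D:\\a\\mypkg\\mypkg\\venv\\Lib\\site-packages\\mypkg\\core.py"

print(relativize(posix_name, {"mypkg"}))
print(relativize(windows_name, {"mypkg"}))
```

With names stored this way, databases from different runners would already agree on file names before they are combined.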

Ideally, I would like to see both of these options implemented and maybe even on by default.

zackw, Aug 28 '24 18:08