lcov --initial generates mis-hit for signature in function definition

Versions

Running LCOV version 1.12 on macOS Sierra 10.12.5 with g++ from Apple LLVM version 8.1.0 (clang-802.0.42).

Code snippet

Consider the following simple C++ code:

#include <iostream>

int myFn(int x) {
  return x * x;
}

int main(int argc, char *argv[]) {
  --argc, ++argv;

  std::cout << myFn(12) << std::endl;
  return 0;
}

Coverage result

   Line data    Source code
 1             : #include <iostream>
 2             :
 3           0 : int myFn(int x) {
 4           1 :   return x * x;
 5             : }
 6             :
 7           0 : int main(int argc, char *argv[]) { 
 8           1 :   --argc, ++argv;
 9             :
10           1 :   std::cout << myFn(12) << std::endl;
11           1 :   return 0;
12             : }

PROBLEM: lines 3 and 7 are reported as executable but not hit, which produces an incorrect coverage report.

Steps to reproduce

echo "--------------"
echo "-- Building --"
g++ --coverage -O0 -g0 test.cpp

echo "----------------------"
echo "-- Zeroing counters --"
lcov --zerocounters -d ./

echo "---------------------"
echo "-- Initial capture --"
lcov --capture --no-external --initial -d ./ --gcov-tool covwrap.sh -o base.info

echo "-------------"
echo "-- Running --"
./a.out

echo "---------------"
echo "-- Capturing --"
lcov --capture -d ./ --gcov-tool covwrap.sh -o test.info

echo "---------------"
echo "-- Combining --"
lcov -a base.info -a test.info -o total.info

echo "----------"
echo "-- HTML --"
genhtml total.info -o ./html

where covwrap.sh is simply:

#!/bin/bash
exec llvm-cov-mp-3.9 gcov "$@"

Output

--------------
-- Building --
----------------------
-- Zeroing counters --
Deleting all .da files in ./ and subdirectories
Done.
---------------------
-- Initial capture --
Capturing coverage data from ./
Found gcov version: 3.9.1
Found LLVM gcov version 3.4, which emulates gcov version 4.2
Scanning ./ for .gcno files ...
Found 1 graph files in ./
Processing test.gcno
  ignoring data for external file /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/__locale
  ignoring data for external file /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/ios
  ignoring data for external file /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/ostream
Finished .info-file creation
-------------
-- Running --
144
---------------
-- Capturing --
Capturing coverage data from ./
Found gcov version: 3.9.1
Found LLVM gcov version 3.4, which emulates gcov version 4.2
Scanning ./ for .gcda files ...
Found 1 data files in ./
Processing test.gcda
Finished .info-file creation
---------------
-- Combining --
Combining tracefiles.
Reading tracefile base.info
Reading tracefile test.info
Writing data to total.info
Summary coverage rate:
  lines......: 78.6% (11 of 14 lines)
  functions..: 100.0% (7 of 7 functions)
  branches...: no data found
----------
-- HTML --
Reading data file total.info
Found 4 entries.
Found common filename prefix "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++"
Writing .css and .png files.
Generating output.
Processing file v1/ostream
Processing file v1/__locale
Processing file v1/ios
Processing file /bb/tmp/test.cpp
Writing directory view page.
Overall coverage rate:
  lines......: 78.6% (11 of 14 lines)
  functions..: 100.0% (7 of 7 functions)

May 24 '17 03:05 jbibollet

I tried to reproduce the problem with GNU g++ 4.8.5 but the result was as expected: both test.info and base.info contain an execution count of 1 for lines 3 and 7. Could you run llvm-cov directly (via covwrap.sh -abc test.gcda) on the test.gcda file and post the relevant portion of the resulting test.cpp.gcov file?

May 24 '17 14:05 oberpar

The resulting test.cpp.gcov does not flag lines 3 and 7 as being instrumented/hit:

        -:    0:Source:test.cpp
        -:    0:Graph:test.gcno
        -:    0:Data:test.gcda
        -:    0:Runs:1
        -:    0:Programs:1
        -:    1:#include <iostream>
        -:    2:
function _Z4myFni called 1 returned 100% blocks executed 100%
        -:    3:int myFn(int x) {
        1:    4:  return x * x;
        1:    4-block  0
        -:    5:}
        -:    6:
function main called 1 returned 100% blocks executed 100%
        -:    7:int main(int argc, char *argv[]) {
        1:    8:  --argc, ++argv;
        -:    9:
        1:   10:  std::cout << myFn(12) << std::endl;
        1:   11:  return 0;
        1:   11-block  0
        -:   12:}

Full output of covwrap.sh -abc test.gcda:

covwrap.sh -abc test.gcda
File 'test.cpp'
Lines executed:100.00% of 4
No branches
No calls
test.cpp:creating 'test.cpp.gcov'

File '/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/ostream'
Lines executed:100.00% of 4
No branches
No calls
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/ostream:creating 'ostream.gcov'

File '/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/ios'
Lines executed:50.00% of 2
Branches executed:66.67% of 6
Taken at least once:33.33% of 6
No calls
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/ios:creating 'ios.gcov'

File '/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__locale'
Lines executed:100.00% of 2
No branches
No calls
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__locale:creating '__locale.gcov'

May 24 '17 16:05 jbibollet

It appears that the cause for this problem lies in geninfo's read_gcno_function_record() function that assumes that a function's starting line is always instrumented. This seems to be true for .gcno files generated by GCC, but not for those from LLVM.

A fix needs some more thought though, as the internal data representation for a .gcno file currently doesn't have a dedicated place for reporting function starting lines.
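
To illustrate the effect on the combined tracefile, here is a minimal sketch. It assumes that the initial capture lists the function starting lines (3 and 7) with a count of 0, that the post-run capture omits them (as the test.cpp.gcov output above suggests), and that combining tracefiles sums the per-line counts. The dictionaries are illustrative stand-ins for DA entries, not data copied from real base.info or test.info files, and combine() is a hypothetical helper, not lcov code.

```python
def combine(*tracefiles):
    """Sum per-line execution counts, keeping every line that any input lists."""
    total = {}
    for counts in tracefiles:
        for lineno, hits in counts.items():
            total[lineno] = total.get(lineno, 0) + hits
    return total

# Assumed DA entries for test.cpp (line number -> execution count).
base = {3: 0, 4: 0, 7: 0, 8: 0, 10: 0, 11: 0}  # --initial capture
test = {4: 1, 8: 1, 10: 1, 11: 1}              # capture after running a.out

total = combine(base, test)

# Lines that stay at zero because only the initial capture mentions them.
never_hit = sorted(n for n, hits in total.items() if hits == 0)
print(never_hit)  # [3, 7]
```

Under these assumptions the two signature lines remain at zero after combining, which matches the coverage result in the original report.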

Jun 01 '17 12:06 oberpar

Wondering if there is any workaround for this?

Initial capture indeed seems to assume that the function's starting line is instrumented, which then leads to false positives when they are marked as non-covered.

Edit: it seems as simple as filtering out all lines (where N is a line number) like

DA:N,0

where there exists a previous line in this file saying

FN:N,<function name>

Is there any catch? (re: properly fixing it in lcov itself)

Dec 24 '17 18:12 aldanor

Here's a sample script that "fixes" it, for those who need it working now:

import argparse
import subprocess

def demangle(symbol):
    return subprocess.check_output(['c++filt', '-n', symbol.strip()]).decode().strip()

def filter_lcov(lines, verbose=False):
    defs, srcfile = {}, ''
    for line in lines:
        if line.startswith('SF:'):
            defs = {}
            srcfile = line[3:].strip()
        elif line.startswith('end_of_record'):
            defs = {}
        elif line.startswith('FN:'):
            lineno, symbol = line[3:].split(',')
            defs[lineno] = demangle(symbol)
        elif line.startswith('DA:'):
            lineno = line[3:].split(',')[0]
            if lineno in defs:
                if verbose:
                    print(f'Ignoring: {srcfile}:{lineno}:{defs[lineno]}')
                continue
        yield line

def main():
    p = argparse.ArgumentParser()
    p.add_argument('input', type=str)
    p.add_argument('output', type=str)
    p.add_argument('--verbose', '-v', action='store_true')
    args = p.parse_args()
    with open(args.input, 'r') as fin:
        lines = list(fin)
    with open(args.output, 'w') as fout:
        for line in filter_lcov(lines, verbose=args.verbose):
            fout.write(line)

if __name__ == '__main__':
    main()

(It would be obviously nice to not have to do it manually, though...)

Dec 24 '17 19:12 aldanor

Here's a faster version of your script that works with Objective-C++ sources and only demangles in verbose mode :)

import argparse
import subprocess

def demangle(symbol):
    p = subprocess.Popen(['c++filt','-n'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    return p.communicate(input=symbol.encode())[0].decode().strip()

def filter_lcov(lines, verbose=False):
    defs, srcfile = {}, ''
    for line in lines:
        if line.startswith('SF:'):
            defs = {}
            srcfile = line[3:].strip()
        elif line.startswith('end_of_record'):
            defs = {}
        elif line.startswith('FN:'):
            lineno, symbol = line[3:].split(',')
            if verbose:
                defs[lineno] = demangle(symbol)
            else:
                defs[lineno] = True
        elif line.startswith('DA:'):
            lineno = line[3:].split(',')[0]
            if lineno in defs:
                if verbose:
                    print(f'Ignoring: {srcfile}:{lineno}:{defs[lineno]}')
                continue
        yield line

def main():
    p = argparse.ArgumentParser()
    p.add_argument('input', type=str)
    p.add_argument('output', type=str)
    p.add_argument('--verbose', '-v', action='store_true')
    args = p.parse_args()
    with open(args.input, 'r') as fin:
        lines = list(fin)
    with open(args.output, 'w') as fout:
        for line in filter_lcov(lines, verbose=args.verbose):
            fout.write(line)

if __name__ == '__main__':
    main()

Jan 03 '18 18:01 JeremyAgost

Can you release your workaround script with an open-source license? I want to use it.

Jan 23 '18 23:01 Swift1313

@Swift1313 I published my version of the python script with a public domain license at https://github.com/JeremyAgost/lcov-llvm-function-mishit-filter

Jan 24 '18 08:01 JeremyAgost

@aldanor Considering @JeremyAgost's script is based on yours, can you open-source that as well? Please?

Jan 24 '18 21:01 Swift1313

My g++ (arch linux) outputs the following

Coverage Result

   Line data    Source code
 1             : #include <iostream>
 2             : 
 3           1 : int myFn(int x) {
 4           1 :   return x * x;
 5             : }
 6             : 
 7           1 : int main(int argc, char *argv[]) {
 8           1 :   --argc, ++argv;
 9             : 
10           1 :   std::cout << myFn(12) << std::endl;
11           1 :   return 0;
12           0 : }

For me, the last closing brace (line 12) is the one reported as not hit.

Jun 04 '18 20:06 marehr

Here's a one-liner that should do the same thing as the Python scripts. Awk isn't my forte, so there may be an easier method.

awk -F '[:,]' '/^SF:/ { delete defs } /^FN:/ { defs[$2]=1 } /^DA:/ { if ($3 == 0 && $2 in defs) next } { print }'

Explanation: -F '[:,]' sets the field separator FS to the regular expression [:,]. The defs array is deleted for every new source file. On each function definition, the line number is used as a key into defs (with an arbitrary value of 1). Each zero count for lines that are part of the definition is not printed whereas every other line is.

This is simple enough to place directly into a Makefile (but you do need to double the $ to pass them through make).
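
For comparison with the Python scripts earlier in the thread, the same logic can be sketched in Python. Unlike those scripts, this version mirrors the awk one-liner and drops a DA entry only when its count is 0. It assumes, as the awk version does, that the FN records of a source file appear before its DA records. The function name and the sample records are hypothetical and only serve the illustration.

```python
def drop_zero_hit_function_lines(lines):
    """Yield tracefile lines, skipping zero-count DA entries on FN start lines."""
    defs = set()
    for line in lines:
        if line.startswith('SF:'):
            defs = set()                       # new source file: forget old functions
        elif line.startswith('FN:'):
            defs.add(line[3:].split(',')[0])   # remember the function's start line
        elif line.startswith('DA:'):
            fields = line[3:].strip().split(',')
            if fields[1] == '0' and fields[0] in defs:
                continue                       # zero hits on a signature line: drop
        yield line

# Illustrative records, not taken from a real tracefile.
sample = [
    'SF:/tmp/test.cpp\n',
    'FN:3,_Z4myFni\n',
    'DA:3,0\n',
    'DA:4,1\n',
    'end_of_record\n',
]
filtered = list(drop_zero_hit_function_lines(sample))
print(''.join(filtered), end='')
```

Keeping entries with a non-zero count means that signature lines a toolchain does instrument and hit are still reported.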

Sep 17 '18 02:09 stevecheckoway

Hola, is there no fix for this yet?

Mar 06 '19 10:03 popescu-af

I have NOT tested against upstream to see whether it's fixed.

Mar 07 '19 19:03 JeremyAgost

Hi, this issue seems to have been lingering open for a very long time. Does anyone know if it is still a problem? If so, and if there is a testcase, I will try to fix it (in the lcov sources, with no external scripts or hacking). Note that there are a number of LLVM-11 related fixes in a recent TOT commit. If it has been fixed by now, I would like to close it.

Thanks

Dec 13 '22 11:12 henry2cox

The problem mentioned in https://github.com/linux-test-project/lcov/issues/30#issuecomment-305481438 refers to lcov code that directly interprets binary .gcno files. This code is no longer used for toolchains that support the intermediate gcov output format such as LLVM 11/Xcode 12.5 and GCC versions 5 and above, therefore this problem should no longer be present when using those versions.

Given the amount of effort that would be required to fix the gcno parsing code and the existing workaround for affected users to switch to more current toolchains, I'm closing this issue as won't fix.

May 02 '23 09:05 oberpar