Travis runs segfault on binary built with clang++-3.7
Maybe this is an issue with how travis-ci builds things, or with my setup.
https://travis-ci.org/rollbear/trompeloeil/jobs/99500002
If you follow the link above, and click "after_success" near the very bottom, you'll see how kcov is fetched, built and run.
Oddly, kcov itself is built with clang++-3.4, which is rather antiquated, while the binary it runs is built with clang++-3.7.
When run, the program dies with SIGSEGV.
The neighbour build https://travis-ci.org/rollbear/trompeloeil/jobs/99500001 also builds kcov with clang++-3.4, but here the binary it runs is built with clang++-3.6, and all is fine.
I am not sure when this was introduced, but on Dec 11 it worked and now it doesn't.
Does running kcov with --skip-solibs help?
Solib handling uses an LD_PRELOADed shared library that runs in the process context, and that could perhaps be affected by the compiler mix.
Nope, --skip-solibs does not help.
I also changed the build so that it compiles kcov with the same compiler it builds the test program with, still with the same result.
I can't reproduce this myself, after building the latest git trompeloeil with
CXXFLAGS="-ICatch-1.2.1-develop.12/single_include -std=c++14 -g" CXX=/home/ska/Downloads/clang+llvm-3.7.0-x86_64-fedora22/bin/clang++ make -f Makefile.travis
and running
kcov --include-pattern=trompeloeil.hpp /tmp/kcov ./self_test
I've tried kcovs built with GCC 5.3.1 and clang 3.5, with the same result. One thing you could try is to install the binutils-dev package and run kcov with --verify. This will catch broken DWARF debug info, where breakpoints point to the middle of an instruction.
However, that will mostly give SIGILLs, so it's probably not the case here.
Unfortunately I can't reproduce the problem myself either, but it consistently happens on every build in the travis-ci containers. I really don't know how to get the necessary information out of it to understand what's happening.
OK, this has now gotten weird for real. The problem now shows both with clang++ 3.6 and 3.7. The v12 tagged builds crashed with SIGSEGV on the kcov runs for both. I've added a rule to the travis build to dump core, and run gdb on kcov failure, showing the backtrace. It now runs all tests successfully to completion, and then fails. A core dump is found, and all gdb says is "No stack".
But are you sure it actually produced a core dump? I guess running gdb without one would give that result.
It might be that kcov incorrectly returns an error (?)
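One way to tell these cases apart is to check the exit status and look for the core file before handing anything to gdb. A minimal sketch, not taken from the actual travis configuration: the function name `backtrace_on_crash` is made up, and it assumes the kernel's core_pattern writes `core` or `core.<pid>` into the current directory, which may not hold in the travis-ci containers.

```shell
# Sketch: run a command with core dumps enabled, and only call gdb if a core file exists.
# Assumption: core files land in the current directory as "core" or "core.<pid>".
backtrace_on_crash() {
    binary=$1
    shift
    ulimit -c unlimited 2>/dev/null || echo "warning: could not raise core size limit" >&2
    "$binary" "$@"
    status=$?
    [ "$status" -eq 0 ] && return 0

    # A shell reports 128 + signal number when the process was killed by a signal.
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))" >&2
    fi

    # Pick the first matching core file, if any.
    core=
    for f in core core.*; do
        [ -f "$f" ] && { core=$f; break; }
    done
    if [ -z "$core" ]; then
        echo "exit status $status, but no core file found" >&2
    else
        gdb -batch -ex bt "$binary" "$core"
    fi
    return "$status"
}
```

Used as `backtrace_on_crash kcov --include-pattern=trompeloeil.hpp /tmp/kcov ./self_test`, this would report a plain error return separately from a signal, and would say so explicitly when there is no core to inspect.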
clang++ allows you to specify which version of DWARF debug info to produce (2, 3 or 4). Would it make a difference for kcov which one is generated?
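To test whether the DWARF version matters, one option is to build the test program once per version and run kcov on each result. A minimal sketch, assuming the compiler in `CXX` (defaulting to `clang++`) accepts `-gdwarf-2`, `-gdwarf-3` and `-gdwarf-4`; the function name, the source file argument and the `self_test_dwarfN` output names are made up for illustration, and a real build would also need the include paths used elsewhere in this thread.

```shell
# Sketch: build the same source once per DWARF version so each binary can be tried under kcov.
# Assumptions: CXX defaults to clang++; source and output names are placeholders.
build_dwarf_variants() {
    src=$1
    for v in 2 3 4; do
        # -gdwarf-N selects the DWARF version of the emitted debug info.
        "${CXX:-clang++}" -std=c++14 -g -gdwarf-"$v" "$src" -o "self_test_dwarf$v" || return 1
    done
}
```

If only some of the resulting binaries crash kcov, that would point at the debug info format rather than at the travis environment.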
By running kcov inside gdb in the travis build, I managed to capture this backtrace. This is after the test program has successfully run to completion.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bb0d3a in dwarf_getsrclines () from /usr/lib/x86_64-linux-gnu/libdw.so.1
#0 0x00007ffff7bb0d3a in dwarf_getsrclines () from /usr/lib/x86_64-linux-gnu/libdw.so.1
#1 0x0000000000443ec9 in kcov::DwarfParser::forEachLine(kcov::IFileParser::ILineListener&) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/dwarf.cc:48
#2 0x0000000000442b04 in ElfInstance::parseOneDwarf(unsigned long) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:420
#3 0x0000000000441aad in ElfInstance::doParse(unsigned long) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:305
#4 0x00000000004416dc in ElfInstance::parse() () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:284
#5 0x0000000000415359 in Collector::run(std::string const&) () at /home/travis/build/rollbear/trompeloeil/kcov/src/collector.cc:48
#6 0x000000000042abcb in main ()
The stack trace looks the same with clang++ 3.6 and 3.7. With g++ 4.9 and 5, it does not crash.
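For anyone wanting to capture a similar trace in a non-interactive CI job, gdb can be driven in batch mode. This is only a sketch of one way to do it, not necessarily how the trace above was produced; `run_under_gdb` is a made-up helper name.

```shell
# Sketch: run a program under gdb non-interactively and print a backtrace if it stops.
# -batch, -ex and --args are standard gdb options; everything after --args is the
# program and its arguments.
run_under_gdb() {
    gdb -batch -ex run -ex bt --args "$@"
}
# Example, reusing the kcov invocation from earlier in this thread:
#   run_under_gdb kcov --include-pattern=trompeloeil.hpp /tmp/kcov ./self_test
```

If the program exits normally, the `bt` command has no process to inspect and gdb prints "No stack." instead of a trace.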
I don't think the dwarf debug version should matter, but I'm by no means an expert in the matter.
Anyway, thanks for catching the crash! Since it's in libdw, I guess it's either a bug in libdw, or broken DWARF information generated by clang, or a combination thereof. It's slightly worrying that it only shows up in travis, though, since compilation should be the same in either case.
Late follow-up, but maybe this symbol demangler bug is relevant?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70909
I don't think so. Kcov uses binutils / libbfd only when --verify is passed, and this bug is reproducible without it as well.
Anyway, I suppose a way to get the crash out of travis is to use uuencode or similar on the core file and then analyze it outside.
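A rough sketch of that idea, using `gzip` and `base64` (GNU coreutils) in place of uuencode since they are more likely to be installed. The function names and the BEGIN/END marker lines are invented here; the markers just make it easy to cut the encoded core out of a saved build log.

```shell
# Sketch: print a core file into the CI log as text, then restore it locally.
# Assumes gzip and GNU base64 are available; uuencode/uudecode would work similarly.

# Encode: compress the core and wrap it in marker lines so it can be cut out of the log.
dump_core_to_log() {
    core_file=$1
    echo "-----BEGIN CORE-----"
    gzip -c "$core_file" | base64
    echo "-----END CORE-----"
}

# Decode: extract the lines between the markers from a saved log and rebuild the core.
restore_core_from_log() {
    log_file=$1
    out_file=$2
    sed -n '/^-----BEGIN CORE-----$/,/^-----END CORE-----$/p' "$log_file" \
        | sed '1d;$d' \
        | base64 -d \
        | gzip -dc > "$out_file"
}
```

Analyzing the restored core elsewhere also needs the same kcov binary and shared libraries (libdw in particular) that were used in the travis container, so those may have to be exported the same way.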