travis runs segfaults on binary built with clang++-3.7

Open rollbear opened this issue 10 years ago • 11 comments

Maybe this is an issue with how travis-ci builds stuff, and my setup.

https://travis-ci.org/rollbear/trompeloeil/jobs/99500002

If you follow the link above, and click "after_success" near the very bottom, you'll see how kcov is fetched, built and run.

Oddly, I see that kcov itself is built with clang++-3.4, which is rather antiquated, while the binary it runs is built with clang++-3.7.

When run, the program dies with SIGSEGV.

The neighbour build https://travis-ci.org/rollbear/trompeloeil/jobs/99500001 also builds kcov with clang++-3.4, but here the binary it runs is built with clang++-3.6, and all is fine.

I am not sure when this was introduced, but on Dec 11 it worked and now it doesn't.

rollbear avatar Dec 31 '15 03:12 rollbear

Does running kcov with --skip-solibs help?

Solib handling uses an LD_PRELOADed shared library that runs in the process context, and perhaps that could be affected by the compiler mix.

SimonKagstrom avatar Dec 31 '15 06:12 SimonKagstrom

Nope, --skip-solibs does not help.

I also changed the build so that it compiles kcov with the same compiler it builds the test program with, still with the same result.

rollbear avatar Jan 02 '16 12:01 rollbear

I can't reproduce this myself, after building the latest git trompeloeil with

CXXFLAGS="-ICatch-1.2.1-develop.12/single_include -std=c++14 -g" CXX=/home/ska/Downloads/clang+llvm-3.7.0-x86_64-fedora22/bin/clang++ make -f Makefile.travis

and running

kcov --include-pattern=trompeloeil.hpp /tmp/kcov ./self_test 

I've tried kcovs built with GCC 5.3.1 and clang 3.5, with the same result. One thing you could try is to install the binutils-dev package and run kcov with --verify. This will catch broken DWARF debug info, where breakpoints point to the middle of an instruction.

However, that will mostly give SIGILLs, so it's probably not the case here.

SimonKagstrom avatar Jan 02 '16 12:01 SimonKagstrom

Unfortunately I can't reproduce the problem myself either, but it consistently happens on every build in the travis-ci containers. I really don't know how to get the necessary information out of it to understand what's happening.

rollbear avatar Jan 02 '16 13:01 rollbear

OK, this has now gotten weird for real. The problem now shows both with clang++ 3.6 and 3.7. The v12 tagged builds crashed with SIGSEGV on the kcov runs for both. I've added a rule to the travis build to dump core, and run gdb on kcov failure, showing the backtrace. It now runs all tests successfully to completion, and then fails. A core dump is found, and all gdb says is "No stack".
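A minimal sketch of such a rule, not the actual travis configuration from this build. The helper name run_and_report and the core file names (core, core.*) are assumptions; the real names depend on the kernel's core_pattern setting. It checks that a core file exists before calling gdb, since gdb without one has nothing to show.

```shell
#!/bin/sh
# Hypothetical helper: run a command; if it fails, look for core files in
# the current directory and print a backtrace from each one found.
run_and_report() {
    # Allow core files to be written (may be refused by a hard limit).
    ulimit -c unlimited 2>/dev/null || echo "warning: cannot raise core limit" >&2

    if "$@"; then
        return 0
    else
        rc=$?
    fi

    found=0
    for core in core core.*; do
        [ -f "$core" ] || continue
        found=1
        echo "core file: $core"
        # Assumes the core came from the command itself ($1), not from a
        # child process it was tracing.
        gdb -batch -ex bt "$(command -v "$1")" "$core"
    done
    [ "$found" -eq 1 ] || echo "no core file found"
    return "$rc"
}
```

It could be called as `run_and_report kcov --include-pattern=trompeloeil.hpp /tmp/kcov ./self_test`. If the core belongs to the test program rather than kcov, gdb needs that executable instead.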

rollbear avatar Feb 02 '16 18:02 rollbear

But are you sure it actually produced a core dump? I guess running gdb without one would give that result.

It might be that kcov incorrectly returns an error (?)

SimonKagstrom avatar Feb 02 '16 19:02 SimonKagstrom

clang++ allows you to specify which version of DWARF debug info to produce (2, 3 or 4). Would it make a difference for kcov which one is generated?
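One way to test this would be to rebuild the test program once per DWARF version and run kcov on each. The sketch below is a dry run that only prints the commands. The compiler name clang++-3.7 and the output directories /tmp/kcov-dwarfN are assumptions; the make and kcov invocations are adapted from the ones quoted earlier in this thread. The -gdwarf-2, -gdwarf-3 and -gdwarf-4 flags are the clang++ options that select the version.

```shell
#!/bin/sh
# Dry run: print one build command and one kcov command per DWARF version.
# Compiler name and output directories are assumptions for illustration.
dwarf_commands() {
    for v in 2 3 4; do
        # -gdwarf-N asks clang++ to emit DWARF version N debug info
        echo "CXXFLAGS=\"-ICatch-1.2.1-develop.12/single_include -std=c++14 -gdwarf-$v\" CXX=clang++-3.7 make -f Makefile.travis"
        echo "kcov --include-pattern=trompeloeil.hpp /tmp/kcov-dwarf$v ./self_test"
    done
}

dwarf_commands
```

A clean rebuild between versions is assumed, otherwise make may reuse objects built with the previous flag.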

rollbear avatar Feb 13 '16 10:02 rollbear

By running kcov inside gdb in the travis build, I managed to capture this backtrace. This is after the test program has successfully run to completion.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bb0d3a in dwarf_getsrclines () from /usr/lib/x86_64-linux-gnu/libdw.so.1
#0  0x00007ffff7bb0d3a in dwarf_getsrclines () from /usr/lib/x86_64-linux-gnu/libdw.so.1
#1  0x0000000000443ec9 in kcov::DwarfParser::forEachLine(kcov::IFileParser::ILineListener&) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/dwarf.cc:48
#2  0x0000000000442b04 in ElfInstance::parseOneDwarf(unsigned long) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:420
#3  0x0000000000441aad in ElfInstance::doParse(unsigned long) () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:305
#4  0x00000000004416dc in ElfInstance::parse() () at /home/travis/build/rollbear/trompeloeil/kcov/src/parsers/elf-parser.cc:284
#5  0x0000000000415359 in Collector::run(std::string const&) () at /home/travis/build/rollbear/trompeloeil/kcov/src/collector.cc:48
#6  0x000000000042abcb in main ()

The stack trace looks the same with clang++ 3.6 and 3.7. With g++ 4.9 and 5, it does not crash.

rollbear avatar Feb 13 '16 11:02 rollbear

I don't think the DWARF debug version should matter, but I'm by no means an expert on the subject.

Anyway, thanks for catching the crash! Since it's in libdw, I guess it's either a bug in libdw, or broken DWARF information generated by clang, or a combination thereof. It's slightly worrying that it only shows up in travis though; compilation should be the same in either case.

SimonKagstrom avatar Feb 14 '16 05:02 SimonKagstrom

Late follow-up, but maybe this symbol demangler bug is relevant?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70909

rollbear avatar Nov 22 '16 07:11 rollbear

I don't think so. Kcov uses binutils / libbfd only when --verify is passed, and this bug is reproducible without it as well.

Anyway, I suppose a way to get the crash out of travis is to use uuencode or similar on the core file and then analyze it outside.
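A rough sketch of that idea, using gzip and base64 (GNU coreutils) as the "or similar" in place of uuencode, on the assumption that they are present in the build container. The function names and file names are hypothetical. The encoded text can be printed to the build log, copied out, and decoded locally.

```shell
#!/bin/sh
# Hypothetical helpers for moving a binary core file through a text-only log.

encode_core() {
    # gzip shrinks the file; base64 turns it into printable text
    gzip -c "$1" | base64
}

decode_core() {
    # Inverse: base64 text on stdin, original bytes written to "$1"
    base64 -d | gzip -dc > "$1"
}
```

In the CI job this would be something like `encode_core core`, and locally `decode_core core < saved-log-excerpt.txt` followed by gdb with a kcov binary and libdw matching the ones from the build. Core files can be large, so log size limits may get in the way.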

SimonKagstrom avatar Nov 22 '16 11:11 SimonKagstrom