antlr4
[Cpp] Segmentation fault (core dumped) on s390x
When running the C++ demo https://github.com/antlr/antlr4/blob/master/runtime/Cpp/demo/Linux/main.cpp with an illegal input string (e.g. "\\\\") on the s390x platform, a segmentation fault occurs at https://github.com/antlr/antlr4/blob/master/runtime/Cpp/runtime/src/Lexer.cpp#L75 during the _text = ""; assignment.
Stacktrace:
(lldb) bt
* thread #1, name = 'demo', stop reason = signal SIGSEGV: invalid address (fault address: 0x10bd000)
* frame #0: 0x000003fffd9ab8cc libc.so.6`memcpy + 92
frame #1: 0x000003fffdde921a libstdc++.so.6`std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) + 170
frame #2: 0x000003fffddea56e libstdc++.so.6`std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) + 350
frame #3: 0x0000000001016b7e demo`antlr4::Lexer::nextToken() [inlined] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(this="", __s=<unavailable>) at basic_string.h:1459:9
frame #4: 0x0000000001016b6a demo`antlr4::Lexer::nextToken() [inlined] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(this="", __s=<unavailable>) at basic_string.h:690:22
frame #5: 0x0000000001016b6a demo`antlr4::Lexer::nextToken(this=0x000003fffffff148) at Lexer.cpp:75:11
frame #6: 0x00000000010131f2 demo`antlr4::BufferedTokenStream::fetch(this=0x000003fffffff108, n=1) at BufferedTokenStream.cpp:96:44
frame #7: 0x000000000101314e demo`antlr4::BufferedTokenStream::sync(this=<unavailable>, i=<unavailable>) at BufferedTokenStream.cpp:82:22
frame #8: 0x0000000001013a04 demo`antlr4::BufferedTokenStream::setup(this=0x000003fffffff108) at BufferedTokenStream.cpp:188:3
frame #9: 0x00000000010152e8 demo`antlr4::BufferedTokenStream::fill() [inlined] antlr4::BufferedTokenStream::lazyInit(this=0x000003fffffff108) at BufferedTokenStream.cpp:182:5
frame #10: 0x00000000010152ce demo`antlr4::BufferedTokenStream::fill(this=0x000003fffffff108) at BufferedTokenStream.cpp:401:3
frame #11: 0x0000000001006350 demo`main((null)=<unavailable>, (null)=<unavailable>) at main.cpp:27:10
frame #12: 0x000003fffd9b17f2 libc.so.6`__libc_start_call_main + 146
frame #13: 0x000003fffd9b18d0 libc.so.6`__libc_start_main@@GLIBC_2.34 + 160
frame #14: 0x0000000001006234 demo`_start + 64
(lldb)
Memory check:
[root@3fdfd0510543 build]# valgrind --leak-check=full ./demo
==119691== Memcheck, a memory error detector
==119691== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==119691== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==119691== Command: ./demo
==119691==
line 1:0 token recognition error at: '\'
==119691== Source and destination overlap in memcpy(0x511f040, 0x10a2e6a, 81058152)
==119691== at 0x4843D00: memcpy (vg_replace_strmem.c:1120)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.tcc:315)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:359)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:354)
==119691== by 0x49BE219: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:312)
==119691== by 0x49BF56D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:498)
==119691== by 0x1016B7D: assign (basic_string.h:1459)
==119691== by 0x1016B7D: operator= (basic_string.h:690)
==119691== by 0x1016B7D: antlr4::Lexer::nextToken() (Lexer.cpp:75)
==119691== by 0x10131F1: antlr4::BufferedTokenStream::fetch(unsigned long) (BufferedTokenStream.cpp:96)
==119691== by 0x101314D: antlr4::BufferedTokenStream::sync(unsigned long) (BufferedTokenStream.cpp:82)
==119691== by 0x1013A03: antlr4::BufferedTokenStream::setup() (BufferedTokenStream.cpp:188)
==119691== by 0x10152E7: lazyInit (BufferedTokenStream.cpp:182)
==119691== by 0x10152E7: antlr4::BufferedTokenStream::fill() (BufferedTokenStream.cpp:401)
==119691== by 0x100634F: main (main.cpp:27)
==119691==
==119691== Invalid read of size 2
==119691== at 0x484400E: memcpy (vg_replace_strmem.c:1120)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.tcc:315)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:359)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:354)
==119691== by 0x49BE219: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:312)
==119691== by 0x49BF56D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:498)
==119691== by 0x1016B7D: assign (basic_string.h:1459)
==119691== by 0x1016B7D: operator= (basic_string.h:690)
==119691== by 0x1016B7D: antlr4::Lexer::nextToken() (Lexer.cpp:75)
==119691== by 0x10131F1: antlr4::BufferedTokenStream::fetch(unsigned long) (BufferedTokenStream.cpp:96)
==119691== by 0x101314D: antlr4::BufferedTokenStream::sync(unsigned long) (BufferedTokenStream.cpp:82)
==119691== by 0x1013A03: antlr4::BufferedTokenStream::setup() (BufferedTokenStream.cpp:188)
==119691== by 0x10152E7: lazyInit (BufferedTokenStream.cpp:182)
==119691== by 0x10152E7: antlr4::BufferedTokenStream::fill() (BufferedTokenStream.cpp:401)
==119691== by 0x100634F: main (main.cpp:27)
==119691== Address 0x511f03e is 2 bytes before a block of size 81,058,153 alloc'd
==119691== at 0x483BAA8: operator new(unsigned long) (vg_replace_malloc.c:422)
==119691== by 0x49BE1D1: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:309)
==119691== by 0x49BF56D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:498)
==119691== by 0x1016B7D: assign (basic_string.h:1459)
==119691== by 0x1016B7D: operator= (basic_string.h:690)
==119691== by 0x1016B7D: antlr4::Lexer::nextToken() (Lexer.cpp:75)
==119691== by 0x10131F1: antlr4::BufferedTokenStream::fetch(unsigned long) (BufferedTokenStream.cpp:96)
==119691== by 0x101314D: antlr4::BufferedTokenStream::sync(unsigned long) (BufferedTokenStream.cpp:82)
==119691== by 0x1013A03: antlr4::BufferedTokenStream::setup() (BufferedTokenStream.cpp:188)
==119691== by 0x10152E7: lazyInit (BufferedTokenStream.cpp:182)
==119691== by 0x10152E7: antlr4::BufferedTokenStream::fill() (BufferedTokenStream.cpp:401)
==119691== by 0x100634F: main (main.cpp:27)
==119691==
==119691==
==119691== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==119691== Access not within mapped region at address 0x4852000
==119691== at 0x484400E: memcpy (vg_replace_strmem.c:1120)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.tcc:315)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:359)
==119691== by 0x49BE219: UnknownInlinedFun (basic_string.h:354)
==119691== by 0x49BE219: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:312)
==119691== by 0x49BF56D: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) (basic_string.tcc:498)
==119691== by 0x1016B7D: assign (basic_string.h:1459)
==119691== by 0x1016B7D: operator= (basic_string.h:690)
==119691== by 0x1016B7D: antlr4::Lexer::nextToken() (Lexer.cpp:75)
==119691== by 0x10131F1: antlr4::BufferedTokenStream::fetch(unsigned long) (BufferedTokenStream.cpp:96)
==119691== by 0x101314D: antlr4::BufferedTokenStream::sync(unsigned long) (BufferedTokenStream.cpp:82)
==119691== by 0x1013A03: antlr4::BufferedTokenStream::setup() (BufferedTokenStream.cpp:188)
==119691== by 0x10152E7: lazyInit (BufferedTokenStream.cpp:182)
==119691== by 0x10152E7: antlr4::BufferedTokenStream::fill() (BufferedTokenStream.cpp:401)
==119691== by 0x100634F: main (main.cpp:27)
==119691== If you believe this happened as a result of a stack
==119691== overflow in your program's main thread (unlikely but
==119691== possible), you can try to increase the size of the
==119691== main thread stack using the --main-stacksize= flag.
==119691== The main thread stack size used in this run was 8388608.
==119691==
==119691== HEAP SUMMARY:
==119691== in use at exit: 81,180,562 bytes in 703 blocks
==119691== total heap usage: 816 allocs, 113 frees, 81,194,521 bytes allocated
==119691==
==119691== LEAK SUMMARY:
==119691== definitely lost: 0 bytes in 0 blocks
==119691== indirectly lost: 0 bytes in 0 blocks
==119691== possibly lost: 0 bytes in 0 blocks
==119691== still reachable: 81,180,562 bytes in 703 blocks
==119691== suppressed: 0 bytes in 0 blocks
==119691== Reachable blocks (those to which a pointer was found) are not shown.
==119691== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==119691==
==119691== For lists of detected and suppressed errors, rerun with: -s
==119691== ERROR SUMMARY: 2035982 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
[root@3fdfd0510543 build]#
For some reason std::string tries to copy about 81 MB (valgrind reports a memcpy of 81058152 bytes). I am not sure whether this is an antlr4 problem or a std::string problem.
The error happens only when building with clang with optimisation enabled (-O1, -O2, -O3). In other cases (gcc, or clang without optimisation) the error does not happen.
The error disappears when _text = ""; is replaced with setText(""); or with _text = std::string(""); in Lexer.cpp.
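For reference, here is a minimal sketch of the three assignment forms mentioned above, side by side. LexerLike and the helper function names are hypothetical stand-ins, not the real antlr4::Lexer, and the setText signature is assumed. In standard C++ all three forms leave the string empty, so the difference in behaviour presumably comes from how the optimised build handles the literal assignment or from the state of the object at that point. This sketch does not reproduce the crash.

```cpp
#include <string>

// Hypothetical stand-in for the relevant part of antlr4::Lexer (not the real class).
struct LexerLike {
    std::string _text = "previous token text";

    // Assumed signature, for illustration only.
    void setText(const std::string &text) { _text = text; }
};

// Form 1: the statement from Lexer.cpp:75, assignment from a string literal.
// This goes through std::string::operator=(const char*), i.e. assign/_M_replace.
inline void resetWithLiteral(LexerLike &lexer) { lexer._text = ""; }

// Form 2: first workaround, going through the setter (copy assignment).
inline void resetWithSetter(LexerLike &lexer) { lexer.setText(""); }

// Form 3: second workaround, assigning a temporary std::string (move assignment).
inline void resetWithTemporary(LexerLike &lexer) { lexer._text = std::string(""); }

// Returns true when all three forms leave the string empty, as the language requires.
bool allResetFormsAgree() {
    LexerLike a, b, c;
    resetWithLiteral(a);
    resetWithSetter(b);
    resetWithTemporary(c);
    return a._text.empty() && b._text.empty() && c._text.empty();
}
```

Building a snippet like this with the same clang version and optimisation level on s390x, and comparing the code generated for the three helpers, might help narrow down where the behaviour diverges.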