gdl
Strange case of Segfault
I am constantly getting segmentation faults with my programs (which use call_external) across various GDL versions. Could anyone help me debug this?
With GDL version GDL> !gdl { 1.0.0-rc.2 Sep 24 2021 1632452400 1 0 1}
I get the following result when running under valgrind:
==22407== Warning: client switching stacks? SP change: 0x1ffeffd5c0 --> 0x1ffe7fd528
==22407== to suppress, use: --max-stackframe=8388760 or greater
==22407== Invalid write of size 8
==22407==    at 0xC9215C: lib::copy_basic(char const*, char const*) (in /usr/local/bin/gdl)
==22407== Address 0x1ffe7fd528 is on thread 1's stack
==22407==
==22407== Can't extend stack to 0x1ffe7fc5e8 during signal delivery for thread 1:
==22407== no stack segment
==22407==
==22407== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==22407== Access not within mapped region at address 0x1FFE7FC5E8
==22407==    at 0xC9215C: lib::copy_basic(char const*, char const*) (in /usr/local/bin/gdl)
==22407== If you believe this happened as a result of a stack
==22407== overflow in your program's main thread (unlikely but
==22407== possible), you can try to increase the size of the
==22407== main thread stack using the --main-stacksize= flag.
==22407== The main thread stack size used in this run was 20480000.
==22407== Invalid write of size 8
==22407==    at 0x4835110: _vgnU_freeres (in /usr/libexec/valgrind/vgpreload_core-amd64-linux.so)
==22407== Address 0x1ffe7fd4a0 is on thread 1's stack
==22407==
==22407==
==22407== Process terminating with default action of signal 11 (SIGSEGV)
==22407== Access not within mapped region at address 0x1FFE7FD4A0
==22407==    at 0x4835110: _vgnU_freeres (in /usr/libexec/valgrind/vgpreload_core-amd64-linux.so)
==22407== If you believe this happened as a result of a stack
==22407== overflow in your program's main thread (unlikely but
==22407== possible), you can try to increase the size of the
==22407== main thread stack using the --main-stacksize= flag.
==22407== The main thread stack size used in this run was 20480000.
==22407==
==22407== HEAP SUMMARY:
==22407== in use at exit: 66,605,152 bytes in 152,630 blocks
==22407== total heap usage: 9,330,418 allocs, 9,177,788 frees, 3,032,494,405 bytes allocated
==22407==
==22407== LEAK SUMMARY:
==22407== definitely lost: 126,308 bytes in 236 blocks
==22407== indirectly lost: 53,482 bytes in 25 blocks
==22407== possibly lost: 3,072 bytes in 8 blocks
==22407== still reachable: 66,349,658 bytes in 151,844 blocks
==22407== of which reachable via heuristic:
==22407== newarray : 776 bytes in 1 blocks
==22407== multipleinheritance: 335,872 bytes in 2 blocks
==22407== suppressed: 0 bytes in 0 blocks
==22407== Rerun with --leak-check=full to see details of leaked memory
==22407==
==22407== Use --track-origins=yes to see where uninitialised values come from
==22407== For lists of detected and suppressed errors, rerun with: -s
==22407== ERROR SUMMARY: 7 errors from 3 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
Running GDL under gdb (gdb gdl), I get:
(gdb) bt
#0 0x00007ffff57e7f68 in _int_malloc (av=av@entry=0x7ffff5920a00 <main_arena>, bytes=bytes@entry=31) at malloc.c:3894
#1 0x00007ffff57e92f1 in __GI___libc_malloc (bytes=31) at malloc.c:3237
#2 0x00007ffff5a4396c in operator new(unsigned long) () at /lib64/libstdc++.so.6
#3 0x00007ffff5ad77d2 in std::__cxx11::basic_string<char, std::char_traits
This is what I get with GDL version
GDL> !gdl { "RELEASE": "1.0.0 Git", "BUILD_DATE": "Sep 23 2021", "EPOCH": 1632366000, "GDL_USE_DSFMT": 1, "GDL_USE_WX": 1, "GDL_POSIX": 1 }
==13621== Mismatched free() / delete / delete []
==13621==    at 0x484171B: operator delete(void*) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==13621==    by 0x1086307: lib::draw_polyline(GDLGStream*, Data_<SpDDouble>, Data_<SpDDouble>, double, double, bool, bool, bool, int, bool, bool, Data_<SpDLong>) (in /usr/local/bin/gdl)
==13621==    by 0x1043EC8: lib::oplot_call::old_body(EnvT, GDLGStream*) (in /usr/local/bin/gdl)
==13621==    by 0x104270E: lib::oplot(EnvT*) (in /usr/local/bin/gdl)
==13621==    by 0x1094CC8: PCALL_LIBNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621== Address 0xff3eaf0 is 0 bytes inside a block of size 20 alloc'd
==13621==    at 0x483E7B5: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==13621==    by 0x1085928: lib::draw_polyline(GDLGStream*, Data_<SpDDouble>, Data_<SpDDouble>, double, double, bool, bool, bool, int, bool, bool, Data_<SpDLong>) (in /usr/local/bin/gdl)
==13621==    by 0x1043EC8: lib::oplot_call::old_body(EnvT, GDLGStream*) (in /usr/local/bin/gdl)
==13621==    by 0x104270E: lib::oplot(EnvT*) (in /usr/local/bin/gdl)
==13621==    by 0x1094CC8: PCALL_LIBNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==
==13621== Mismatched free() / delete / delete []
==13621==    at 0x484171B: operator delete(void*) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==13621==    by 0x1043EC8: lib::oplot_call::old_body(EnvT*, GDLGStream*) (in /usr/local/bin/gdl)
==13621==    by 0x104270E: lib::oplot(EnvT*) (in /usr/local/bin/gdl)
==13621==    by 0x1094CC8: PCALL_LIBNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A0B07: GDLInterpreter::interactive(ProgNode*) (in /usr/local/bin/gdl)
==13621== Address 0xfaff520 is 0 bytes inside a block of size 20 alloc'd
==13621==    at 0x483E7B5: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==13621==    by 0x108591D: lib::draw_polyline(GDLGStream*, Data_<SpDDouble>, Data_<SpDDouble>, double, double, bool, bool, bool, int, bool, bool, Data_<SpDLong>) (in /usr/local/bin/gdl)
==13621==    by 0x1043EC8: lib::oplot_call::old_body(EnvT, GDLGStream*) (in /usr/local/bin/gdl)
==13621==    by 0x104270E: lib::oplot(EnvT*) (in /usr/local/bin/gdl)
==13621==    by 0x1094CC8: PCALL_LIBNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x6A1BA7: GDLInterpreter::call_pro(ProgNode*) (in /usr/local/bin/gdl)
==13621==    by 0x1098AA2: PCALLNode::Run() (in /usr/local/bin/gdl)
==13621==    by 0x6A0885: GDLInterpreter::statement(ProgNode*) (in /usr/local/bin/gdl)
==13621==
and later on
==13621== Warning: client switching stacks? SP change: 0x1ffeffd580 --> 0x1ffe7fd4e8
==13621== to suppress, use: --max-stackframe=8388760 or greater
==13621== Invalid write of size 8
==13621==    at 0xCB60CC: lib::copy_basic(char const*, char const*) (in /usr/local/bin/gdl)
==13621== Address 0x1ffe7fd4e8 is on thread 1's stack
==13621==
==13621==
==13621== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==13621== Access not within mapped region at address 0x1FFE7FD4E8
==13621==    at 0xCB60CC: lib::copy_basic(char const*, char const*) (in /usr/local/bin/gdl)
==13621== If you believe this happened as a result of a stack
==13621== overflow in your program's main thread (unlikely but
==13621== possible), you can try to increase the size of the
==13621== main thread stack using the --main-stacksize= flag.
==13621== The main thread stack size used in this run was 20480000.
==13621== Invalid write of size 8
==13621==    at 0x4835110: _vgnU_freeres (in /usr/libexec/valgrind/vgpreload_core-amd64-linux.so)
==13621== Address 0x1ffe7fd4e0 is on thread 1's stack
==13621==
==13621==
==13621== Process terminating with default action of signal 11 (SIGSEGV)
==13621== Access not within mapped region at address 0x1FFE7FD4E0
==13621==    at 0x4835110: _vgnU_freeres (in /usr/libexec/valgrind/vgpreload_core-amd64-linux.so)
==13621== If you believe this happened as a result of a stack
==13621== overflow in your program's main thread (unlikely but
==13621== possible), you can try to increase the size of the
==13621== main thread stack using the --main-stacksize= flag.
==13621== The main thread stack size used in this run was 20480000.
==13621==
==13621== HEAP SUMMARY:
==13621== in use at exit: 66,359,568 bytes in 169,506 blocks
==13621== total heap usage: 9,344,636 allocs, 9,175,130 frees, 3,017,718,309 bytes allocated
==13621==
==13621== LEAK SUMMARY:
==13621== definitely lost: 239,620 bytes in 696 blocks
==13621== indirectly lost: 135,229 bytes in 5,338 blocks
==13621== possibly lost: 5,890 bytes in 79 blocks
==13621== still reachable: 65,885,021 bytes in 162,720 blocks
==13621== of which reachable via heuristic:
==13621== newarray : 776 bytes in 1 blocks
==13621== multipleinheritance: 270,336 bytes in 1 blocks
==13621== suppressed: 0 bytes in 0 blocks
==13621== Rerun with --leak-check=full to see details of leaked memory
==13621==
==13621== For lists of detected and suppressed errors, rerun with: -s
==13621== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
How can I debug this further? Which parameters should I pass to valgrind? Is GDB better suited for this kind of debugging?
Any hint is welcome. Thanks.
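The valgrind output in this post already names several options worth trying: `--leak-check=full`, `--track-origins=yes`, `--max-stackframe=8388760` and `--main-stacksize=`. A minimal sketch of a helper that assembles such a command line follows. The log file name, the default `gdl` path (taken from the traces) and the choice to write the report with `--log-file` are illustrative assumptions, not recommended settings.

```python
import shlex


def valgrind_command(gdl="/usr/local/bin/gdl", log_file="valgrind-gdl.log",
                     max_stackframe=8388760, main_stacksize=None):
    """Build a valgrind command line from the hints printed in the logs."""
    cmd = [
        "valgrind",
        "--leak-check=full",                   # "Rerun with --leak-check=full"
        "--track-origins=yes",                 # "Use --track-origins=yes"
        f"--max-stackframe={max_stackframe}",  # value quoted in the warning
        f"--log-file={log_file}",              # assumption: keep the report in a file
    ]
    if main_stacksize is not None:
        # Only added on request; the logs mention the flag but give no value.
        cmd.append(f"--main-stacksize={main_stacksize}")
    cmd.append(gdl)
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command instead of launching GDL directly.
    print(shlex.join(valgrind_command()))
```

The printed command can be pasted into a shell to start an interactive GDL session under valgrind. Whether a larger main stack changes the behaviour is something to test, not something the logs establish.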
It looks like a double free() in draw_polyline. Are you by any chance using PLOT or PLOTS on a data array that was retrieved externally (through call_external, as you mention)?
Thanks for the report :( Could you try to extract the line, or the few lines, of GDL code that trigger the crash? After that, it is very simple to check locally with Valgrind or by adding flags at compilation. Without being able to reproduce the issue, it is more difficult. As an illustration, see https://github.com/gnudatalanguage/gdl/issues/1030#issuecomment-888653641
I have tried many things to find the line, but with no success yet. The problem is that if I call
GDL>program_a
GDL>program_b
then I get a segfault in program_b.
If I call
GDL>program_b
GDL>program_a
then I get a segfault in program_a.
Very weird.
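Since the crash follows whichever program runs second, one way to narrow it down is to run every ordering in a fresh GDL session and record how each session ends. The sketch below assumes that `gdl` is on the PATH and accepts `-e` to execute a statement and exit (please check `gdl --help` for your build), and that statements can be chained with `&` as in IDL. `program_a` and `program_b` are the placeholder names used above.

```python
import itertools
import shutil
import subprocess


def orderings(programs):
    """Return every call order of the given procedure names."""
    return [list(p) for p in itertools.permutations(programs)]


def gdl_statement(sequence):
    """Chain procedure calls into one line (assumes '&' separates statements)."""
    return " & ".join(sequence)


def run_sequence(sequence, gdl="gdl"):
    """Run one ordering in a fresh GDL process and return its exit status.

    On POSIX, subprocess reports death by signal N as -N, so a
    segmentation fault (SIGSEGV, signal 11) shows up as -11.
    """
    result = subprocess.run([gdl, "-e", gdl_statement(sequence)],
                            capture_output=True, text=True)
    return result.returncode


if __name__ == "__main__":
    if shutil.which("gdl") is None:
        print("gdl not found on PATH; nothing was run")
    else:
        for seq in orderings(["program_a", "program_b"]):
            print(gdl_statement(seq), "->", run_sequence(seq))
```

If a single ordering crashes reliably this way, the same statement can be trimmed down step by step to obtain the short reproducer requested earlier in the thread.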
Well, for now it seems that the following sequence does not lead to a segfault:
GDL>.compile program_b
GDL>program_a
GDL>program_b
I have already tried several combinations, and there is no segfault any more if I compile the program(s) at the beginning.