dpvs
keepalived terminated with signal 6, Aborted
I have two keepalived conf files that differ only in the config item alpha. I reload again and again, with usleep 300 in between.
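For anyone who wants to try the same kind of stress test, here is a rough sketch of such a reload loop in C. It is an assumption-laden illustration, not the reporter's actual script: the helper names (copy_file, stress_reload) are hypothetical, and it assumes the reload is triggered by sending SIGHUP to the running keepalived after copying one of the two conf variants over the live conf file.

```c
#define _DEFAULT_SOURCE   /* for usleep() and kill() under strict -std modes */
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy src over dst. Returns 0 on success, -1 on any I/O error. */
static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    char buf[4096];
    size_t n;
    int rc = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;
    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/*
 * Alternate between two conf variants: install one as the live conf,
 * send SIGHUP to pid (assumed to be the keepalived parent process),
 * sleep delay_us microseconds, and repeat.
 * Returns the number of rounds that completed successfully.
 */
static int stress_reload(pid_t pid, const char *conf_a, const char *conf_b,
                         const char *live, int rounds, unsigned int delay_us)
{
    for (int i = 0; i < rounds; i++) {
        const char *variant = (i % 2 == 0) ? conf_a : conf_b;
        if (copy_file(variant, live) != 0)
            return i;
        if (kill(pid, SIGHUP) != 0)
            return i;
        usleep(delay_us);
    }
    return rounds;
}
```

A caller would pass the keepalived pid, the two variant files, the live path (/etc/keepalived/keepalived.conf in the command line shown in the trace below), and 300 as the delay. The variant file names are up to the caller; nothing here is taken from the dpvs or keepalived sources.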
# gdb ./keepalived core.90055
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/xxxx/dpvs/bin/keepalived...done.
[New LWP 90055]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./keepalived -d -f /etc/keepalived/keepalived.conf -S 6'.
Program terminated with signal 6, Aborted.
#0 0x00007f22b24701f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libselinux-2.5-11.el7.x86_64 nss-softokn-freebl-3.28.3-8.el7_4.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007f22b24701f7 in raise () from /lib64/libc.so.6
#1 0x00007f22b24718e8 in abort () from /lib64/libc.so.6
#2 0x00007f22b2469266 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f22b2469312 in __assert_fail () from /lib64/libc.so.6
#4 0x000000000041eede in keepalived_malloc (size=size@entry=24, file=file@entry=0x432731 "list.c", function=function@entry=0x4327bc <FUNCTION.3195> "alloc_element",
line=line@entry=36) at memory.c:119
#5 0x0000000000422c15 in alloc_element () at list.c:36
#6 list_add (l=0x2466910, data=0x2467e30) at list.c:43
#7 0x000000000041d769 in process_stream (keywords_vec=
That's an assertion failure. I believe you have enabled debug mode. In that mode, keepalived sets MAX_ALLOC_LIST to 2048, which means you can reload at most 2048 times. If you still want to use debug mode, you can increase MAX_ALLOC_LIST in lib/memory.h; otherwise, stop using debug mode. For more information, read the code in lib/memory.c.
I don't think I reloaded that many times... I reloaded it only about 113 times.
Sorry, my description was wrong and may have misled you. MAX_ALLOC_LIST is not the number of reloads. I just found this issue in keepalived (https://github.com/acassen/keepalived/issues/390). So why not try it? If you read the code in lib/memory.c, you will see what I mean. The file begins with "ifdef --debug".
Yes, I enabled debug to reproduce the previous problem. I read that issue (acassen/keepalived#390), and I think I can disable the vrrp process and give it a try; is that OK? I don't need vrrp in production. Thank you very much, @mscbg :)
You can give it a try, but I don't think it will work. keepalived_malloc()/REALLOC() are not used only in the vrrp process.
I disabled debug, then configured and reloaded keepalived with two conf files that differ only in their real servers. It terminated with signal 6 about 10 times overnight. Keepalived was started with -C: ./keepalived -d -C -f /etc/keepalived/keepalived.conf -S 6
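To make the point about the cap concrete, here is a minimal sketch of a debug allocator that records every allocation in a fixed-size table and asserts when the table is full. This is not keepalived's code from lib/memory.c: the names (MAX_TRACKED, debug_malloc, debug_free, live_allocations) are hypothetical, and the slot-reuse policy is an assumption made for illustration. It only shows why such a limit is tied to tracked allocations and not to the number of reloads.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TRACKED 2048  /* hypothetical cap; 2048 is the value quoted above */

struct tracked {
    void *ptr;            /* NULL marks an empty slot */
    size_t size;
    const char *file;     /* call site, as in the file=/line= args in the trace */
    int line;
};

static struct tracked table[MAX_TRACKED];
static int slots_used;    /* high-water mark of slots handed out so far */

/* Allocate zeroed memory and record it; aborts via assert when the table is full. */
static void *debug_malloc(size_t size, const char *file, int line)
{
    int i = 0;

    /* Reuse the first empty slot below the high-water mark, if any. */
    while (i < slots_used && table[i].ptr != NULL)
        i++;
    if (i == slots_used) {
        assert(slots_used < MAX_TRACKED);  /* this kind of check raises SIGABRT */
        slots_used++;
    }

    void *p = calloc(1, size ? size : 1);
    if (p == NULL)
        return NULL;

    table[i].ptr = p;
    table[i].size = size;
    table[i].file = file;
    table[i].line = line;
    return p;
}

/* Free a tracked pointer. Returns 0 on success, -1 if it was never tracked. */
static int debug_free(void *p)
{
    if (p == NULL)
        return -1;
    for (int i = 0; i < slots_used; i++) {
        if (table[i].ptr == p) {
            free(p);
            table[i].ptr = NULL;  /* slot becomes reusable */
            return 0;
        }
    }
    return -1;
}

/* Count the allocations that are currently tracked and not yet freed. */
static int live_allocations(void)
{
    int n = 0;
    for (int i = 0; i < slots_used; i++)
        if (table[i].ptr != NULL)
            n++;
    return n;
}
```

With a table like this, a single reload that allocates many list elements, or a reload path that never releases its slots, can reach the cap long before 2048 reloads, which would fit a crash after roughly 113 reloads.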
Program terminated with signal 6, Aborted.
#0 0x00007f0295d981f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libgcc-4.8.5-16.el7.x86_64 libselinux-2.5-11.el7.x86_64 nss-softokn-freebl-3.28.3-8.el7_4.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007f0295d981f7 in raise () from /lib64/libc.so.6
#1 0x00007f0295d998e8 in abort () from /lib64/libc.so.6
#2 0x00007f0295dd7f47 in __libc_message () from /lib64/libc.so.6
#3 0x00007f0295dddb54 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f0295ddf7aa in _int_free () from /lib64/libc.so.6
#5 0x0000000000405615 in ?? ()
#6 0x000000000041facd in ?? ()
#7 0x000000000040575b in ?? ()
#8 0x00000000004030a1 in ?? ()
#9 0x00007f0295d84c05 in __libc_start_main () from /lib64/libc.so.6
#10 0x000000000040315a in ?? ()
(gdb)
Or this:
Core was generated by `./keepalived -d -C -f /etc/keepalived/keepalived.conf -S 6'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f0295ddf30b in _int_free () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libselinux-2.5-11.el7.x86_64 nss-softokn-freebl-3.28.3-8.el7_4.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007f0295ddf30b in _int_free () from /lib64/libc.so.6
#1 0x0000000000405615 in ?? ()
#2 0x000000000041facd in ?? ()
#3 0x000000000040575b in ?? ()
#4 0x0000000000405856 in ?? ()
#5 0x000000000041facd in ?? ()
#6 0x00000000004030b9 in ?? ()
#7 0x00007f0295d84c05 in __libc_start_main () from /lib64/libc.so.6
#8 0x000000000040315a in ?? ()
(gdb)
With an unstripped binary, I think the stack looks like this:
Program terminated with signal 6, Aborted.
#0 0x00007f9862d5c1f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-8.el7.x86_64 libcom_err-1.42.9-10.el7.x86_64 libgcc-4.8.5-16.el7.x86_64 libselinux-2.5-11.el7.x86_64 nss-softokn-freebl-3.28.3-8.el7_4.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007f9862d5c1f7 in raise () from /lib64/libc.so.6
#1 0x00007f9862d5d8e8 in abort () from /lib64/libc.so.6
#2 0x00007f9862d9bf47 in __libc_message () from /lib64/libc.so.6
#3 0x00007f9862da1b54 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f9862da37aa in _int_free () from /lib64/libc.so.6
#5 0x0000000000405615 in reload_check_thread (thread=
So, you ran keepalived many times with the command './keepalived -d -C -f /etc/keepalived/keepalived.conf -S 6'? Did you see any log lines in /var/log/messages like "198121 Jan 31 13:01:21 10 Keepalived[28307]: Healthcheck child process(951) died: Respawning"?
And can you attach a configuration file? I will try to reproduce it. We have run keepalived in production for a long time, and it seems to work well. I believe that if it is configured properly, it will run stably. That said, keepalived may really have many bugs; I have seen many crash issues on keepalived's GitHub.
This seems related: https://github.com/iqiyi/dpvs/issues/126