coredump with enable-subnet

Open fly542 opened this issue 2 years ago • 4 comments

Describe the bug: When we run a stress test with our own data, unbound always crashes with a coredump if it was compiled with --enable-subnet. This started with version 1.15.0.

To reproduce: The coredump was introduced by commit 773d1f29111b40445ec4bedf416c1f7b64c0605b. When the line per_upstream_opt_list = qstate->edns_opts_back_out; is removed from services/outside_network.c, the coredump no longer occurs.

System:

  • Unbound version: 1.15.0
  • OS: centos7.2
  • unbound -V output: Version 1.15.0

Configure line: --prefix=/soft/unbound/ --with-conf-file=/soft/unbound/etc/unbound.conf --disable-rpath --enable-subnet
Linked libs: mini-event internal (it uses select), OpenSSL 1.0.1e-fips 11 Feb 2013
Linked modules: dns64 subnetcache respip validator iterator

BSD licensed, see LICENSE in source package for details.

Additional information: coredump stack

(gdb) bt
#0  0x00002b35bd241de6 in __memcmp_sse4_1 () from /lib64/libc.so.6
#1  0x000000000049834f in edns_opt_compare (q=0x2b35da1897b8, p=0x2b35da1897b8) at util/data/msgreply.c:1207
#2  edns_opt_list_compare (p=0x2b35da1897b8, q=0x2b35da1897b8) at util/data/msgreply.c:1215
#3  0x000000000046a15c in serviced_cmp (key1=0x2b35da190ed0, key2=0x2b35da190ed0) at services/outside_network.c:144
#4  0x000000000048c48a in rbtree_find_less_equal (rbtree=<optimized out>, key=<optimized out>, result=<optimized out>) at util/rbtree.c:527
#5  0x000000000048c85e in rbtree_search (rbtree=rbtree@entry=0x2b35d81315c0, key=<optimized out>) at util/rbtree.c:285
#6  0x000000000048c8e0 in rbtree_delete (rbtree=0x2b35d81315c0, key=<optimized out>) at util/rbtree.c:333
#7  0x0000000000433635 in outnet_serviced_query_stop (sq=<optimized out>, cb_arg=<optimized out>, sq=<optimized out>)
    at services/outside_network.c:3480
#8  0x0000000000433667 in outbound_list_clear (list=list@entry=0x2b35da1a1af8) at services/outbound_list.c:60
#9  0x00000000004336aa in iter_clear (qstate=qstate@entry=0x2b35da1a1600, id=id@entry=2) at iterator/iterator.c:3999
#10 0x000000000048f2a5 in mesh_state_cleanup (mstate=0x2b35da1a15b0) at services/mesh.c:908
#11 mesh_state_delete (qstate=<optimized out>, qstate=<optimized out>) at services/mesh.c:950
#12 0x000000000049086d in mesh_make_new_space (mesh=0x2b35d83f9dd0, qbuf=0x2b35d8002cd0) at services/mesh.c:357
#13 0x0000000000445fa2 in mesh_new_client (qid=49977, rep=0x2b35bfb019f0, edns=0x2b35bfb01580, qflags=256, cinfo=0x0, qinfo=0x2b35bfb01560, 
    mesh=0x2b35d83f9dd0) at services/mesh.c:478
#14 worker_handle_request (c=c@entry=0x2b35d81180a0, arg=0x207b2e0, error=error@entry=0, repinfo=repinfo@entry=0x2b35bfb019f0)
    at daemon/worker.c:1528
#15 0x0000000000475051 in comm_point_udp_ancil_callback (fd=13, event=event@entry=2, arg=<optimized out>) at util/netevent.c:724
#16 0x00000000004ad5d7 in handle_select.41039 (base=0x2b35d8000930, wait=<optimized out>) at util/mini_event.c:220
#17 0x000000000046c141 in minievent_base_dispatch (base=0x2b35d8000930) at util/mini_event.c:242
#18 ub_event_base_dispatch (base=0x2b35d8000930) at util/ub_event.c:280
#19 comm_base_dispatch (b=<optimized out>) at util/netevent.c:256
#20 0x000000000045602e in worker_work (worker=0x207b2e0) at daemon/worker.c:1921
#21 thread_start.6895 (arg=0x207b2e0) at daemon/daemon.c:541
#22 0x00002b35bcec3dd5 in start_thread () from /lib64/libpthread.so.0
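
Frames #1 to #3 place the fault inside a byte-wise comparison of EDNS option lists, reached from the rbtree delete with the same key on both sides. The following is a minimal sketch of that kind of comparator, not the Unbound source: the struct `opt_node` and the functions `opt_compare` and `opt_list_compare` are hypothetical stand-ins chosen for illustration. It shows why such a comparator can fault even when both arguments are the same pointer: `memcmp` still dereferences the option data, so a data pointer that is no longer valid would crash at exactly this point. Whether that is the actual cause here is an assumption that this thread does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one EDNS option in a linked list.
 * This is NOT Unbound's struct definition. */
struct opt_node {
	struct opt_node* next;  /* next option, or NULL at the end of the list */
	uint16_t code;          /* option code */
	size_t len;             /* length of data in bytes */
	const uint8_t* data;    /* option payload; must stay valid while compared */
};

/* Compare two single options: by code, then by length, then by payload. */
static int opt_compare(const struct opt_node* p, const struct opt_node* q)
{
	if(p->code != q->code)
		return p->code < q->code ? -1 : 1;
	if(p->len != q->len)
		return p->len < q->len ? -1 : 1;
	if(p->len != 0)
		/* Reads p->data and q->data; a stale pointer faults here,
		 * even when p == q. */
		return memcmp(p->data, q->data, p->len);
	return 0;
}

/* Compare two option lists element by element; a shorter list sorts first. */
static int opt_list_compare(const struct opt_node* p, const struct opt_node* q)
{
	while(p && q) {
		int r = opt_compare(p, q);
		if(r != 0)
			return r;
		p = p->next;
		q = q->next;
	}
	if(p)
		return 1;
	if(q)
		return -1;
	return 0;
}
```

In a tree keyed by such a comparator, every lookup and delete walks the option data again, so the data has to live at least as long as the tree node that refers to it.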

fly542, Apr 15 '22 05:04

Hi,

I'm trying to reproduce your crash. So far, I didn't succeed. I don't know if it makes a difference, but the most recent version of CentOS is 7.9.2009 so I tried that.

Since 1.15.0, there have been a few commits regarding subnet handling. Could you please test the current git head (1289c53c1ad698e51a7adf0271d63af992d78a33)?

If that also fails, then maybe you can help me reproduce the problem?

Philip

Philip-NLnetLabs, Apr 25 '22 08:04

Using commit 1289c53c1ad698e51a7adf0271d63af992d78a33 also causes a coredump. The stack is as follows:

#0  0x00002af8bbe04de6 in __memcmp_sse4_1 () from /lib64/libc.so.6
#1  0x00000000004cee3b in edns_opt_compare (p=0x2af8edd0e0a8, q=0x2af8edd0e0a8) at util/data/msgreply.c:1207
#2  0x00000000004cee69 in edns_opt_list_compare (p=0x2af8edd0e0a8, q=0x2af8edd0e0a8) at util/data/msgreply.c:1215
#3  0x0000000000422e03 in serviced_cmp (key1=0x2af8edd157a0, key2=0x2af8edd157a0) at services/outside_network.c:144
#4  0x0000000000486ca6 in rbtree_find_less_equal (rbtree=0x2af8ec1315c0, key=0x2af8edd157a0, result=0x2af8bf0c94f8) at util/rbtree.c:527
#5  0x00000000004864a6 in rbtree_search (rbtree=0x2af8ec1315c0, key=0x2af8edd157a0) at util/rbtree.c:285
#6  0x00000000004865e3 in rbtree_delete (rbtree=0x2af8ec1315c0, key=0x2af8edd157a0) at util/rbtree.c:333
#7  0x00000000004180e4 in outnet_serviced_query_stop (sq=0x2af8edd157a0, cb_arg=0x2af8eddbb218) at services/outside_network.c:3483
#8  0x00000000004a4d34 in outbound_list_clear (list=0x2af8eddbaeb8) at services/outbound_list.c:60
#9  0x00000000004bc78c in iter_clear (qstate=0x2af8eddba9a0, id=2) at iterator/iterator.c:4001
#10 0x00000000004b085d in mesh_state_cleanup (mstate=0x2af8eddba950) at services/mesh.c:913
#11 0x00000000004b0aa5 in mesh_state_delete (qstate=0x2af8eddba9a0) at services/mesh.c:955
#12 0x00000000004af09a in mesh_make_new_space (mesh=0x2af8ec3f9dd0, qbuf=0x2af8ec002cd0) at services/mesh.c:357
#13 0x00000000004af5e6 in mesh_new_client (mesh=0x2af8ec3f9dd0, qinfo=0x2af8bf0c9910, cinfo=0x0, qflags=256, edns=0x2af8bf0c98e0, 
    rep=0x2af8bf0c9b40, qid=14329, rpz_passthru=0) at services/mesh.c:479
#14 0x00000000004e5b82 in worker_handle_request (c=0x2af8ec1180a0, arg=0xc12c40, error=0, repinfo=0x2af8bf0c9b40) at daemon/worker.c:1544
#15 0x0000000000429a9c in comm_point_udp_ancil_callback (fd=25, event=2, arg=0x2af8ec1180a0) at util/netevent.c:724
#16 0x00000000004818f6 in handle_select (base=0x2af8ec000930, wait=0x2af8bf0c9d60) at util/mini_event.c:220
#17 0x0000000000481989 in minievent_base_dispatch (base=0x2af8ec000930) at util/mini_event.c:242
#18 0x0000000000419030 in ub_event_base_dispatch (base=0x2af8ec000930) at util/ub_event.c:280
#19 0x0000000000428d65 in comm_base_dispatch (b=0x2af8ec0008c0) at util/netevent.c:256
#20 0x00000000004e6c4b in worker_work (worker=0xc12c40) at daemon/worker.c:1947
#21 0x00000000004f7718 in thread_start (arg=0xc12c40) at daemon/daemon.c:541
#22 0x00002af8bba86dd5 in start_thread () from /lib64/libpthread.so.0
#23 0x00002af8bbd9902d in clone () from /lib64/libc.so.6

fly542, Apr 29 '22 04:04

Hi,

Pity that it still fails. Do you have anything to help me reproduce the problem?

Philip

Philip-NLnetLabs, May 02 '22 10:05

Send me an email at [email protected] and I'll send you the test tool and the test data file.

fly542, May 05 '22 02:05