netopeer2
netopeer2 copied to clipboard
netopeer2-server crash when startTime in subscription is future time
Hi, We are observing that netopeer2-server (netopeer2-server 2.1.10) is crashing if startTime of future comes in the notification subscription, after sending failure response:
<?xml version="1.0" encoding="UTF-8"?><rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1657087369540">
<create-subscription xmlns="urn:ietf:params:xml:ns:netconf:notification:1.0">
<stream>NETCONF</stream>
<filter>
<NrCellCU-state-info xmlns="urn:3gpp:sa5:junila_3gpp-nr-nrm-nrcellcu"/>
<x2c-state-info xmlns="urn:3gpp:sa5:junila_3gpp-nr-nrm-nrcellcu"/>
</filter>
<startTime>2024-04-05T17:51:28.321Z</startTime>
</create-subscription>
</rpc>
CreateSubscription
<?xml version="1.0" encoding="UTF-8"?><rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1657087369540">
<rpc-error>
<error-type>protocol</error-type>
<error-tag>bad-element</error-tag>
<error-severity>error</error-severity>
<error-message xml:lang="en">Specified "startTime" is in future.</error-message>
<error-info>
<bad-element>startTime</bad-element>
</error-info>
</rpc-error>
</rpc-reply>
crash bt:
Core was generated by netopeer2-server -dv2 -t 60'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f0cbcc6efd6 in __strcmp_sse42 () from /usr/lib64/libc.so.6
#0 0x00007f0cbcc6efd6 in __strcmp_sse42 () from /usr/lib64/libc.so.6
#1 0x0000000000409c71 in np2srv_err_reply_sr (err_info=0x7f0cac002040)
at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:286
#2 0x000000000040a038 in np2srv_rpc_cb (rpc=0x7f0cac001d10,
ncs=0x7f0ca8000c60)
at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:377
#3 0x00007f0cbdc1f78f in nc_ps_poll () from /usr/local/lib64/libnetconf2.so.3
#4 0x000000000040c6be in worker_thread (arg=0x1362820)
at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:1029
#5 0x00007f0cbd4d4ea5 in start_thread () from /usr/lib64/libpthread.so.0
#6 0x00007f0cbcc2eb0d in clone () from /usr/lib64/libc.so.6
Hi, Any update on this.
Hi, Anything on this please. Is this support being stopped? Just asking as we didnt get response for couple of issues raised in the past 2 -3 weeks.
Hi Michal, Can you please respond for this. This looks straight forward, somewhere errInfo is not having proper values set hence the crash. This is bit of a blocker issue for us.
Some more information on crash:
(gdb) #1 0x0000000000409c71 in np2srv_err_reply_sr (err_info=0x7f0398001de0) at netopeer2/src/main.c:304 304 e = nc_err(ly_ctx, NC_ERR_OP_FAILED, NC_ERR_TYPE_APP); (gdb) l 299 } 300 301 ly_ctx = sr_acquire_context(np2srv.sr_conn); 302 for (i = 0; i < err_info->err_count; ++i) { 303 /* generic error */ 304 e = nc_err(ly_ctx, NC_ERR_OP_FAILED, NC_ERR_TYPE_APP); 305 nc_err_set_msg(e, err_info->err[i].message, "en"); 306 307 if (reply) { 308 nc_server_reply_add_err(reply, e); (gdb) p err_info->err[0] $5 = {err_code = SR_ERR_INVAL_ARG, message = 0x7f0398002740 "NETCONF error occurred", error_format = 0xffffffff98002230 <Address 0xffffffff98002230 out of bounds>, error_data = 0x7f0398002540} (gdb) p err_info->err[1] $6 = {err_code = SR_ERR_CALLBACK_FAILED, message = 0x7f0398001dc0 "User callback failed.", error_format = 0x0, error_data = 0x0} (gdb) bt
Attached the:
- corefile
- lib
- netopeer2-server binary. crash-info.zip
I was out-of-office for 3 weeks, which is why the support stopped. But in this case the first thing you were supposed to try is use the current version 2.1.30. I could not reproduce the issue with it.
Hi Michal, No issues. Upgrading to latest would be bit long pole task. Can you please try it on netopeer2-server 2.1.10 once and check. This seems very straight forward crash but since i am not very well versed with the code, not able to fix it.
Hi Michal, Can you please respond on this. This crash is easily reproducible with version netopeer2 version 2.1.10 and sysrepo version libsysrepo v2.1.15 (SO v7.1.3) Following are simple steps to reproduce:
- run netopeer2-cli, netopeer2-server and application_changes_example (not sure if this is required, but i ran in my test)
- connect
- user-rpc --content = rpc.txt (file attached, please change the notification containers and use from your sample yangs)
- netopeer2-server will crash.
We really cannot afford to upgrade the netopeer2 version, so asking for help for this pointed fix.
I am sorry but you are asking for a support task that we do not provide (for free at least). We are fixing issues found in the latest code, which benefits us and all the users. What you want benefits only you and just wastes our time. I am quite sure this has already been reported based on what I have fixed it so just try to find the original issue to see the referenced commit.
Hi Michal, Ok no problem. Can you just help in giving some code flow for this. I see that the error is starting from np2srv_rpc_subscribe_cb: cur_ts = np_gettimespec(1); if (start.tv_sec && (np_difftimespec(&start, &cur_ts) < 0)) { np_err_bad_element(session, "startTime", "Specified "startTime" is in future."); rc = SR_ERR_INVAL_ARG; goto cleanup;
But after this where does the code flow and how it reaches np2srv_rpc_cb -> np2srv_err_reply_sr is not clear. If you can give some clue here, i will try to find out which commit fixed this issue. Thanks.
I think my previous suggestion is a better one because the code flow is quite complicated. The error is stores in the SR session, which is then communicated using sysrepo to another main RPC callback, which processes it into a NETCONF error, which is then sent to the NC client.
Hi Michal, Sometime back we had started netopeer2 upgrade activity, so we tried with that code same scenario, crash is still there. Here are the versions used: netopeer2-server 2.1.26 sysrepoctl - sysrepo YANG schema manipulation tool, compiled with libsysrepo v2.1.72 (SO v7.7.6)
So mostly the crash would exist in latest code also, please try the steps given earlier (https://github.com/CESNET/netopeer2/issues/1235#issuecomment-1188436874). Thanks.
I have just tested it with the current devel of all the projects (netopeer2 2.1.33), it worked just fine. netopeer2-cli output:
> user-rpc
ERROR
type: protocol
tag: bad-element
severity: error
message: Specified "startTime" is in future.
info:
<bad-element xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">startTime</bad-element>
Oh ok. Is there any way you can point us to the fix, since even in version netopeer2-server 2.1.26 this crash was there. It will be of great help, we are in final stages of testing and cannot risk an upgrade now. So looking for some help here please. You can tell us fix in netopeer2-server 2.1.26, that also will be fine. Thanks.
You are really testing my patience but I did actually test it with older versions but I could not even reproduce the crash. Try cleaning up your environment (to make sure some old version of the libraries are not used) or updating some of the dependency libraries, I am not able to help you.
Hi Michal, Thanks for you patience but we really have to fix this issue. Just confirming once, since i didnt mention this earlier. Following is our observation on this crash:
- run netopeer2-cli
- run netopeer2-server
- subscribe with startTime in future
- (3) works
- subscribe again with startTime in future ---> here crash happens.
So essentially first time, we get proper error at netopeer2-cli:
user-rpc --content rpc.txt ERROR type: protocol tag: bad-element severity: error message: Specified "startTime" is in future. info:
startTime
user-rpc --content rpc.txt ---------> This one crashes.
Request you to please try this use case, you can try with latest code also. I think this issue exists with latest code too. Thanks.
Tested, even with version 2.1.16, no crash occurred. I give up.
Ok we will check again. Just confirming, so in your test, even second time (without closing connection, on existing connection) subscribe got proper error of startTime in future? Since second time we are seeing the crash, not first time.
Yes, I have tried sending it even 5 times, both in the latest version and in 2.1.16.