netopeer2 icon indicating copy to clipboard operation
netopeer2 copied to clipboard

netopeer2-server crash when startTime in subscription is future time

Open meghna-doshi opened this issue 3 years ago • 18 comments

Hi, We are observing that netopeer2-server (netopeer2-server 2.1.10) is crashing if startTime of future comes in the notification subscription, after sending failure response:

<?xml version="1.0" encoding="UTF-8"?><rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1657087369540">
<create-subscription xmlns="urn:ietf:params:xml:ns:netconf:notification:1.0">
<stream>NETCONF</stream>
<filter>
<NrCellCU-state-info xmlns="urn:3gpp:sa5:junila_3gpp-nr-nrm-nrcellcu"/>
<x2c-state-info xmlns="urn:3gpp:sa5:junila_3gpp-nr-nrm-nrcellcu"/>
</filter>
<startTime>2024-04-05T17:51:28.321Z</startTime>
</create-subscription>
</rpc>

 

CreateSubscription

<?xml version="1.0" encoding="UTF-8"?><rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1657087369540">
<rpc-error>
<error-type>protocol</error-type>
<error-tag>bad-element</error-tag>
<error-severity>error</error-severity>
<error-message xml:lang="en">Specified "startTime" is in future.</error-message>
<error-info>
<bad-element>startTime</bad-element>
</error-info>
</rpc-error>
</rpc-reply>

crash bt:

Core was generated by netopeer2-server -dv2 -t 60'.

Program terminated with signal 11, Segmentation fault.

#0  0x00007f0cbcc6efd6 in __strcmp_sse42 () from /usr/lib64/libc.so.6

#0  0x00007f0cbcc6efd6 in __strcmp_sse42 () from /usr/lib64/libc.so.6

#1  0x0000000000409c71 in np2srv_err_reply_sr (err_info=0x7f0cac002040)

at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:286

#2  0x000000000040a038 in np2srv_rpc_cb (rpc=0x7f0cac001d10,

ncs=0x7f0ca8000c60)

at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:377

#3  0x00007f0cbdc1f78f in nc_ps_poll () from /usr/local/lib64/libnetconf2.so.3

#4  0x000000000040c6be in worker_thread (arg=0x1362820)

at /workspace/siqbuild/PRG-BUILDS/onecell_integration_6.5/6.5.2.083/CUCP/ran_3rdparty/open_source/netconf/netopeer2/src/main.c:1029

#5  0x00007f0cbd4d4ea5 in start_thread () from /usr/lib64/libpthread.so.0

#6  0x00007f0cbcc2eb0d in clone () from /usr/lib64/libc.so.6

meghna-doshi avatar Jul 06 '22 06:07 meghna-doshi

Hi, Any update on this.

meghna-doshi avatar Jul 13 '22 05:07 meghna-doshi

Hi, Anything on this please. Is this support being stopped? Just asking as we didnt get response for couple of issues raised in the past 2 -3 weeks.

meghna-doshi avatar Jul 14 '22 22:07 meghna-doshi

Hi Michal, Can you please respond for this. This looks straight forward, somewhere errInfo is not having proper values set hence the crash. This is bit of a blocker issue for us.

meghna-doshi avatar Jul 15 '22 03:07 meghna-doshi

Some more information on crash:

(gdb) #1 0x0000000000409c71 in np2srv_err_reply_sr (err_info=0x7f0398001de0) at netopeer2/src/main.c:304 304 e = nc_err(ly_ctx, NC_ERR_OP_FAILED, NC_ERR_TYPE_APP); (gdb) l 299 } 300 301 ly_ctx = sr_acquire_context(np2srv.sr_conn); 302 for (i = 0; i < err_info->err_count; ++i) { 303 /* generic error */ 304 e = nc_err(ly_ctx, NC_ERR_OP_FAILED, NC_ERR_TYPE_APP); 305 nc_err_set_msg(e, err_info->err[i].message, "en"); 306 307 if (reply) { 308 nc_server_reply_add_err(reply, e); (gdb) p err_info->err[0] $5 = {err_code = SR_ERR_INVAL_ARG, message = 0x7f0398002740 "NETCONF error occurred", error_format = 0xffffffff98002230 <Address 0xffffffff98002230 out of bounds>, error_data = 0x7f0398002540} (gdb) p err_info->err[1] $6 = {err_code = SR_ERR_CALLBACK_FAILED, message = 0x7f0398001dc0 "User callback failed.", error_format = 0x0, error_data = 0x0} (gdb) bt

Attached the:

  1. corefile
  2. lib
  3. netopeer2-server binary. crash-info.zip

meghna-doshi avatar Jul 15 '22 04:07 meghna-doshi

I was out-of-office for 3 weeks, which is why the support stopped. But in this case the first thing you were supposed to try is use the current version 2.1.30. I could not reproduce the issue with it.

michalvasko avatar Jul 15 '22 06:07 michalvasko

Hi Michal, No issues. Upgrading to latest would be bit long pole task. Can you please try it on netopeer2-server 2.1.10 once and check. This seems very straight forward crash but since i am not very well versed with the code, not able to fix it.

meghna-doshi avatar Jul 15 '22 17:07 meghna-doshi

Hi Michal, Can you please respond on this. This crash is easily reproducible with version netopeer2 version 2.1.10 and sysrepo version libsysrepo v2.1.15 (SO v7.1.3) Following are simple steps to reproduce:

  1. run netopeer2-cli, netopeer2-server and application_changes_example (not sure if this is required, but i ran in my test)
  2. connect
  3. user-rpc --content = rpc.txt (file attached, please change the notification containers and use from your sample yangs)
  4. netopeer2-server will crash.

We really cannot afford to upgrade the netopeer2 version, so asking for help for this pointed fix.

rpc.txt

meghna-doshi avatar Jul 18 '22 23:07 meghna-doshi

I am sorry but you are asking for a support task that we do not provide (for free at least). We are fixing issues found in the latest code, which benefits us and all the users. What you want benefits only you and just wastes our time. I am quite sure this has already been reported based on what I have fixed it so just try to find the original issue to see the referenced commit.

michalvasko avatar Jul 19 '22 07:07 michalvasko

Hi Michal, Ok no problem. Can you just help in giving some code flow for this. I see that the error is starting from np2srv_rpc_subscribe_cb: cur_ts = np_gettimespec(1); if (start.tv_sec && (np_difftimespec(&start, &cur_ts) < 0)) { np_err_bad_element(session, "startTime", "Specified "startTime" is in future."); rc = SR_ERR_INVAL_ARG; goto cleanup;

But after this where does the code flow and how it reaches np2srv_rpc_cb -> np2srv_err_reply_sr is not clear. If you can give some clue here, i will try to find out which commit fixed this issue. Thanks.

meghna-doshi avatar Jul 19 '22 07:07 meghna-doshi

I think my previous suggestion is a better one because the code flow is quite complicated. The error is stores in the SR session, which is then communicated using sysrepo to another main RPC callback, which processes it into a NETCONF error, which is then sent to the NC client.

michalvasko avatar Jul 19 '22 07:07 michalvasko

Hi Michal, Sometime back we had started netopeer2 upgrade activity, so we tried with that code same scenario, crash is still there. Here are the versions used: netopeer2-server 2.1.26 sysrepoctl - sysrepo YANG schema manipulation tool, compiled with libsysrepo v2.1.72 (SO v7.7.6)

So mostly the crash would exist in latest code also, please try the steps given earlier (https://github.com/CESNET/netopeer2/issues/1235#issuecomment-1188436874). Thanks.

meghna-doshi avatar Jul 19 '22 19:07 meghna-doshi

I have just tested it with the current devel of all the projects (netopeer2 2.1.33), it worked just fine. netopeer2-cli output:

> user-rpc 
ERROR
        type:     protocol
        tag:      bad-element
        severity: error
        message:  Specified "startTime" is in future.
        info:
<bad-element xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">startTime</bad-element>

michalvasko avatar Jul 20 '22 12:07 michalvasko

Oh ok. Is there any way you can point us to the fix, since even in version netopeer2-server 2.1.26 this crash was there. It will be of great help, we are in final stages of testing and cannot risk an upgrade now. So looking for some help here please. You can tell us fix in netopeer2-server 2.1.26, that also will be fine. Thanks.

meghna-doshi avatar Jul 20 '22 18:07 meghna-doshi

You are really testing my patience but I did actually test it with older versions but I could not even reproduce the crash. Try cleaning up your environment (to make sure some old version of the libraries are not used) or updating some of the dependency libraries, I am not able to help you.

michalvasko avatar Jul 22 '22 06:07 michalvasko

Hi Michal, Thanks for you patience but we really have to fix this issue. Just confirming once, since i didnt mention this earlier. Following is our observation on this crash:

  1. run netopeer2-cli
  2. run netopeer2-server
  3. subscribe with startTime in future
  4. (3) works
  5. subscribe again with startTime in future ---> here crash happens.

So essentially first time, we get proper error at netopeer2-cli:

user-rpc --content rpc.txt ERROR type: protocol tag: bad-element severity: error message: Specified "startTime" is in future. info: startTime

user-rpc --content rpc.txt ---------> This one crashes.

Request you to please try this use case, you can try with latest code also. I think this issue exists with latest code too. Thanks.

meghna-doshi avatar Jul 26 '22 00:07 meghna-doshi

Tested, even with version 2.1.16, no crash occurred. I give up.

michalvasko avatar Jul 26 '22 06:07 michalvasko

Ok we will check again. Just confirming, so in your test, even second time (without closing connection, on existing connection) subscribe got proper error of startTime in future? Since second time we are seeing the crash, not first time.

meghna-doshi avatar Jul 26 '22 17:07 meghna-doshi

Yes, I have tried sending it even 5 times, both in the latest version and in 2.1.16.

michalvasko avatar Jul 27 '22 06:07 michalvasko