libcoap 4.3.0 build with libcoap

Open fun-works opened this issue 1 year ago • 30 comments

libcoap version 4.3.0

I am trying to build libcoap with LwIP + FreeRTOS for an ST device. When I configure and build with USE_CUSTOM_POOL in LwIP, I get the following error: include/lwip/stats.h:271:26: error: 'MEMP_MAX' undeclared here (not in a function)

Following is the include stack:

In file included from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/netif.h:50,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/sockets.h:47,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/net.h:24,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/async.h:20,
                 from /home/projects/playground1/build/debug/Component/stack/src/stack/_libcoap_EP-prefix/src/_libcoap_EP-build/include/coap3/coap.h:42,
                 from /home/projects/playground1/build/debug/_deps/libcoap-src/include/coap3/coap_internal.h:38,
                 from /home/projects/playground1/Component/lwip/src/porting/user/lwippools.h:14,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/priv/memp_std.h:142,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/memp.h:49,
                 from /home/projects/playground1/Component/lwip/src/lwip/src/api/api_lib.c:63:
/home/projects/playground1/Component/lwip/src/lwip/src/include/lwip/stats.h:271:26: error: 'MEMP_MAX' undeclared here (not in a function)
  271 |   struct stats_mem *memp[MEMP_MAX];

Can you please help me understand how to configure the pool and its options so that it builds and runs successfully?

Do I need a specific commit of LwIP, or am I missing any options?

fun-works avatar Jul 20 '23 09:07 fun-works

As mentioned previously, 4.3.0 is somewhat old and things have moved on since then, including updates to LwIP. I would recommend you try the latest libcoap develop branch to see if there are any issues there.

MEMP_MAX is defined in the LwIP source in src/include/lwip/memp.h.
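For context on why the include order matters here: as far as I understand it, memp.h produces MEMP_MAX as the last enumerator of an enum that is generated by expanding LWIP_MEMPOOL() over the pool list, and a custom lwippools.h is pulled in as part of that list. The sketch below is a minimal, self-contained model of that pattern. All DEMO_* names are hypothetical and the sizes are illustrative; this is not LwIP's actual code.

```c
#include <stddef.h>

/* Hypothetical pool list, standing in for the LWIP_MEMPOOL() entries that
 * LwIP and a custom lwippools.h contribute. Sizes are illustrative only. */
#define DEMO_POOL_LIST(X)                  \
  X(PBUF, 16, 64, "PBUF")                  \
  X(COAP_CONTEXT, 1, 1056, "COAP_CONTEXT")

/* First expansion: one enumerator per pool. DEMO_MEMP_MAX only exists once
 * the whole list has been processed, so a header that is pulled in while the
 * list is still being processed cannot use it yet. */
typedef enum {
#define DEMO_AS_ENUM(name, num, size, desc) DEMO_MEMP_##name,
  DEMO_POOL_LIST(DEMO_AS_ENUM)
#undef DEMO_AS_ENUM
  DEMO_MEMP_MAX
} demo_memp_t;

struct demo_pool_desc {
  const char *desc; /* human-readable pool name */
  size_t num;       /* number of elements */
  size_t size;      /* bytes per element */
};

/* Second expansion of the same list: one descriptor per pool, indexed by
 * the enumerators generated above. */
static const struct demo_pool_desc demo_pools[DEMO_MEMP_MAX] = {
#define DEMO_AS_DESC(name, num, size, desc) { desc, num, size },
  DEMO_POOL_LIST(DEMO_AS_DESC)
#undef DEMO_AS_DESC
};
```

In the include stack above, lwippools.h includes coap_internal.h, which reaches lwip/stats.h before memp.h has finished defining the enum. That would be consistent with the 'MEMP_MAX' undeclared error.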

For 4.3.0, the supported lwip branch is STABLE-2_0_3_RELEASE and the supported lwip-contrib branch is STABLE-2_0_1_RELEASE.

For the latest libcoap code, it is (lwip) STABLE-2_1_3_RELEASE and (lwip-contrib) STABLE-2_1_0_RELEASE.

With the correct lwip branches, the 4.3.0 and develop code build with no errors, but they use their own variant of lwipopts.h and lwippools.h.

mrdeep1 avatar Jul 20 '23 10:07 mrdeep1

OK, I have now updated to the latest libcoap, and I am trying to build with FreeRTOS using:

#define NO_SYS       0
#define SYS_FREERTOS 1

And I am getting the following error:

/home/projects/playground1/build/debug/_deps/libcoap-src/src/coap_subscribe.c:60:3: warning: implicit declaration of function 'COAP_MUTEX_DEFINE' [-Wimplicit-function-declaration]
   60 |   COAP_MUTEX_DEFINE(e_static_mutex);
      |   ^~~~~~~~~~~~~~~~~
/home/projects/playground1/build/debug/_deps/libcoap-src/src/coap_subscribe.c:60:21: error: 'e_static_mutex' undeclared (first use in this function)
   60 |   COAP_MUTEX_DEFINE(e_static_mutex);

It seems like I need to define COAP_MUTEX_DEFINE, but I am not sure where?

fun-works avatar Jul 20 '23 13:07 fun-works

Well, I added the lines:

#define COAP_MUTEX_DEFINE(_name)                        \
  static sys_mutex_t _name

at coap_mutex_internal.h:63 and that part now builds. But I have some other errors now.

But I think mutex is missing on libcoap for now and you need to fix that.

fun-works avatar Jul 20 '23 13:07 fun-works

Thanks for your help troubleshooting here.

But I think mutex is missing on libcoap for now and you need to fix that.

#1181 Raised for this.

error: 'coap_layer_func_t' has no member named 'lwip_write'

Looks like there is a #define write lwip_write somewhere. I will go through and create a PR so that we do not get name clashes when using things like write.
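To illustrate the kind of clash being described, here is a minimal sketch, assuming a compatibility macro that maps write onto a socket-layer function. The demo_* names and the member name l_write are assumptions made for this example, not a statement of what libcoap does.

```c
#include <stddef.h>

/* Stand-in for a layer function table. The member is deliberately named
 * l_write rather than write, so that a macro called write cannot rename it.
 * (The name l_write is an assumption chosen for this sketch.) */
typedef struct {
  int (*l_write)(int fd, const void *data, size_t len);
} demo_layer_func_t;

/* Stand-in for the socket-layer function a compatibility macro points at. */
static int demo_lwip_write(int fd, const void *data, size_t len) {
  (void)fd;
  (void)data;
  return (int)len; /* pretend every byte was written */
}

/* From here on, every identifier spelled `write` becomes demo_lwip_write.
 * If the struct above had a member called write, `layer->write` below would
 * turn into `layer->demo_lwip_write` and fail with "has no member named". */
#define write demo_lwip_write

/* The macro is still usable where it is meant to be used: as a function. */
static const demo_layer_func_t demo_layer = { write };

static int demo_layer_send(const demo_layer_func_t *layer,
                           const void *data, size_t len) {
  return layer->l_write(1, data, len); /* unaffected by the macro */
}
```

Calling demo_layer_send(&demo_layer, "abc", 3) goes through l_write and reaches demo_lwip_write. The alternative is to make sure the macro is never visible to the code that declares or uses the struct.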

The conversion warnings do need to be checked, but are unlikely to be causing any issues.

mrdeep1 avatar Jul 20 '23 13:07 mrdeep1

I will go through and create a PR so that we do not get name clashes when using things like write.

This should now be fixed in the latest version of the develop branch.

mrdeep1 avatar Jul 20 '23 14:07 mrdeep1

#1181 Raised for this.

Is there a specific reason to use a variable name that starts with an underscore (_)? This is considered bad practice because this naming pattern is reserved for system-internal use.

obgm avatar Jul 20 '23 16:07 obgm

Good question. This was following how all of the other COAP_MUTEX_DEFINE() variants have been defined. The actual variable does not end up with a leading _, but all usage of _name can be changed.
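A small sketch of that point, using a stand-in type so that it builds without LwIP (demo_mutex_t and DEMO_MUTEX_DEFINE are hypothetical names): the underscore only appears in the macro parameter, and the object that gets defined carries whatever name the caller passes in.

```c
/* Stand-in for sys_mutex_t so that the sketch builds without LwIP. */
typedef struct {
  int locked;
} demo_mutex_t;

/* Same shape as the macro quoted earlier: _name is only a macro parameter. */
#define DEMO_MUTEX_DEFINE(_name) static demo_mutex_t _name

/* Expands to: static demo_mutex_t e_static_mutex; */
DEMO_MUTEX_DEFINE(e_static_mutex);

/* The object is reachable under the caller's name, with no leading
 * underscore. Static storage is zero-initialized, so it starts unlocked. */
static int demo_is_unlocked(void) {
  return e_static_mutex.locked == 0;
}
```

Renaming the parameter (for example to name) would not change the generated code; it would only avoid the reserved-looking spelling in the macro definition itself.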

mrdeep1 avatar Jul 20 '23 16:07 mrdeep1

Looks like there is a #define write lwip_write somewhere. I will go through and create a PR so that we do not get name clashes when using things like write.

Yes, that define was elsewhere in my project, to provide POSIX-compatible calls. I fixed it by disabling the option; however, it can also be avoided as you mentioned. Thanks.

fun-works avatar Jul 20 '23 17:07 fun-works

Btw, is there any plan to have a release with all these lwip changes ? And is it possible to provide cmake support as well for lwip ?

fun-works avatar Jul 20 '23 17:07 fun-works

Btw, is there any plan to have a release with all these lwip changes ?

We are shortly going to be releasing 4.3.2 release candidate 1 (4.3.2rc1) which includes the LwIP changes.

And is it possible to provide cmake support as well for lwip ?

It just needs someone to create the appropriate cmake files for building LwIP with libcoap and submit a PR. I see that LwIP STABLE-2_1_3_RELEASE includes a CMakeLists.txt.

mrdeep1 avatar Jul 21 '23 09:07 mrdeep1

I updated to the latest libcoap and to lwip 2.1.3. It is still crashing at coap_net.c:473: c = coap_malloc_type(COAP_CONTEXT, sizeof(coap_context_t));

I am trying to run it using FreeRTOS on an embedded device.

I have set MEMP_USE_CUSTOM_POOL to 1 already. Do I need to implement anything for the memory pool? Any hints, please?

fun-works avatar Jul 21 '23 12:07 fun-works

Is it that c is getting returned as NULL, or is it crashing somewhere in coap_malloc_type() / memp_malloc() ?

In examples/lwip/config/lwipopts.h the libcoap example for LwIP has #define MEMP_USE_CUSTOM_POOLS 1 , and the (libcoap) memory pools are defined in examples/lwip/config/lwippools.h . I guess you need to check which lwippools.h is getting included.

mrdeep1 avatar Jul 21 '23 13:07 mrdeep1

My memp_pools[] looks like this: [image]

Index 19 is supposed to be for COAP_CONTEXT.

fun-works avatar Jul 21 '23 13:07 fun-works

Interesting - I would be expecting the addresses to be sequentially incrementing, so it looks like it is not picking up LWIP_MEMPOOL(COAP_CONTEXT, MEMP_NUM_COAPCONTEXT, sizeof(coap_context_t), "COAP_CONTEXT") from your final lwippools.h.

Under Linux, I get the following from the libcoap-built client executable:

(gdb) p memp_pools
$1 = {0x4412e0, 0x441320, 0x441360, 0x4413a0, 0x4413e0, 0x441420, 0x441460, 0x4414a0, 0x4414e0, 0x441520, 0x441560, 0x4415a0, 
  0x4415e0, 0x441620, 0x441660, 0x4416a0, 0x4416e0, 0x441720, 0x441760, 0x4417a0}
(gdb) p memp_COAP_CONTEXT
$5 = {desc = 0x441548 "COAP_CONTEXT", stats = 0x64e930, size = 1056, num = 1, base = 0x658e40 "", tab = 0x64e948}
(gdb) p/x &memp_PBUF
$1 = 0x4414e0
(gdb) p/x &memp_PBUF_POOL
$2 = 0x441520
(gdb) p/x &memp_COAP_CONTEXT
$7 = 0x441560

mrdeep1 avatar Jul 21 '23 14:07 mrdeep1

Yes, maybe I have some pool configuration issues. I will check this on my side. Thanks.

fun-works avatar Jul 21 '23 17:07 fun-works

OK, I checked my code: I had an empty lwippools.h, and I corrected it to use the one from libcoap. Now it looks like this: [image]

However, as you can see, *tab is 0, which causes the allocation to fail. Am I missing anything further? Could you please help here?

fun-works avatar Jul 24 '23 05:07 fun-works

I think I am trying to initialize libcoap before lwip. Let me fix this first.

fun-works avatar Jul 24 '23 06:07 fun-works

Yes, lwip_init() should be called before coap_startup(), which in turn should be called before coap_new_context().
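One way to picture why the ordering matters, and why *tab was 0 before the fix: in a fixed-size pool allocator the free list is only built during initialization, so an allocation attempted before that finds nothing to hand out. The sketch below is a simplified, self-contained model under that assumption. The demo_* names are hypothetical and this is not LwIP's implementation.

```c
#include <stddef.h>

/* One free element; while free, its first bytes hold the free-list link. */
struct demo_memp {
  struct demo_memp *next;
};

/* Simplified pool descriptor with a base/tab pair like the one visible in
 * the debugger output discussed above. */
struct demo_pool {
  size_t size;            /* bytes per element */
  size_t num;             /* number of elements */
  unsigned char *base;    /* backing storage */
  struct demo_memp **tab; /* head of the free list */
};

#define DEMO_NUM_CONTEXTS 2

/* Backing storage; the union keeps each element aligned for the link. */
static union {
  struct demo_memp link;
  unsigned char bytes[32];
} demo_storage[DEMO_NUM_CONTEXTS];

static struct demo_memp *demo_tab; /* NULL until demo_pool_init() has run */

static struct demo_pool demo_context_pool = {
  sizeof(demo_storage[0]), DEMO_NUM_CONTEXTS,
  (unsigned char *)demo_storage, &demo_tab
};

/* Plays the role of the stack-wide init step: chain every element into
 * the free list. */
static void demo_pool_init(struct demo_pool *pool) {
  *pool->tab = NULL;
  for (size_t i = 0; i < pool->num; i++) {
    struct demo_memp *elem =
        (struct demo_memp *)(void *)(pool->base + i * pool->size);
    elem->next = *pool->tab;
    *pool->tab = elem;
  }
}

/* Pop one element, or return NULL if the free list is empty or was never
 * initialized. */
static void *demo_pool_alloc(struct demo_pool *pool) {
  struct demo_memp *elem = *pool->tab;
  if (elem == NULL) {
    return NULL;
  }
  *pool->tab = elem->next;
  return elem;
}
```

In this model, demo_pool_alloc() returns NULL until demo_pool_init() has been called, then succeeds DEMO_NUM_CONTEXTS times before the pool is exhausted.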

mrdeep1 avatar Jul 24 '23 07:07 mrdeep1

OK, I have fixed it now. However, I am not able to respond: I can receive a GET request, but no response is sent. My implementation, written for libcoap 4.3.0, has not changed, and it works on Linux. Also, I can only receive a request twice; no request is received after that. I can ping the device, which means it has not crashed. Any hints on this?

The following is the response PDU I formed: [image]

fun-works avatar Jul 24 '23 10:07 fun-works

Update: If I comment out the lock and unlock in the UDP send as below, I can send out the responses: [image]

I think this is a nested lock: the lock being taken here is already held. I am analyzing this.

But I can still only receive two messages?

fun-works avatar Jul 24 '23 12:07 fun-works

It looks like LOCK_TCPIP_CORE() was invoked in coap_io_process(), which then timed out and called coap_io_process_timeout(), which then called coap_io_prepare_io(), which then tried to send out an unsolicited observe response, or async delayed response which called coap_socket_send().

This could have happened after 2 responses. The whole approach to locking TCPIP_CORE needs to be reviewed.
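The call chain described above can be modelled in a few lines. This is a single-threaded sketch with hypothetical demo_* names, assuming the core lock is not recursive; where the model returns -1, a real non-recursive mutex would block instead.

```c
#include <stdbool.h>

static bool demo_core_locked; /* models a non-recursive core lock */

/* Returns 0 on success, -1 if the lock is already held. A real
 * non-recursive mutex would block here instead of returning an error. */
static int demo_lock_core(void) {
  if (demo_core_locked) {
    return -1;
  }
  demo_core_locked = true;
  return 0;
}

static void demo_unlock_core(void) {
  demo_core_locked = false;
}

/* Models the send path, which takes the core lock itself. */
static int demo_socket_send(void) {
  if (demo_lock_core() != 0) {
    return -1; /* nested acquisition detected */
  }
  /* ... hand the datagram to the stack ... */
  demo_unlock_core();
  return 0;
}

/* Models the I/O processing path: it holds the lock while timeout handling
 * tries to send a delayed response. */
static int demo_io_process(void) {
  int rc;

  if (demo_lock_core() != 0) {
    return -1;
  }
  rc = demo_socket_send(); /* send attempted from inside the lock */
  demo_unlock_core();
  return rc;
}
```

A direct demo_socket_send() succeeds, while the same send reached through demo_io_process() fails because the lock is already held. That matches the observation that removing the lock and unlock from the send path lets responses out.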

mrdeep1 avatar Jul 24 '23 12:07 mrdeep1

Interesting, because I am processing them in a piggybacked way, so there should not be a delay in responding.

I tried to debug the two-request reception issue and found that the line at coap_io_lwip.c:245, session = coap_endpoint_get_session(ep, packet, now);

returns NULL, and the CoAP session count is 2. Am I supposed to free the CoAP session somewhere, or is that missing somewhere in libcoap?

fun-works avatar Jul 24 '23 13:07 fun-works

The example lwip-server code includes coap_context_set_max_idle_sessions(main_coap_context, MEMP_NUM_COAPSESSION - 1); to force idle sessions to be cleaned up in coap_endpoint_get_session(), which leaves space for one new incoming session.

An idle session is defined as one whose reference count == 0 and which has nothing to be sent in the delay queue.
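The effect of that setting can be sketched with a small model of a fixed-size session pool. The demo_* names are hypothetical, the pool size of 2 mirrors the session count reported above, and the reclaim rule is a simplification (it frees the first idle session it finds rather than the least recently used one); it is not libcoap's implementation.

```c
#include <stddef.h>

#define DEMO_NUM_SESSIONS 2 /* plays the role of MEMP_NUM_COAPSESSION */

struct demo_session {
  int in_use;
  unsigned ref;    /* reference count */
  unsigned queued; /* entries waiting in the delay queue */
};

static struct demo_session demo_sessions[DEMO_NUM_SESSIONS];
static unsigned demo_max_idle; /* 0: idle sessions are never reclaimed */

/* Clear the pool and set the idle limit. */
static void demo_reset(unsigned max_idle) {
  for (size_t i = 0; i < DEMO_NUM_SESSIONS; i++) {
    demo_sessions[i].in_use = 0;
    demo_sessions[i].ref = 0;
    demo_sessions[i].queued = 0;
  }
  demo_max_idle = max_idle;
}

/* Idle: allocated, unreferenced, and nothing left to send. */
static int demo_is_idle(const struct demo_session *s) {
  return s->in_use && s->ref == 0 && s->queued == 0;
}

/* Models a lookup for a request from a previously unseen peer. */
static struct demo_session *demo_get_session(void) {
  unsigned idle = 0;
  struct demo_session *victim = NULL;

  for (size_t i = 0; i < DEMO_NUM_SESSIONS; i++) {
    if (demo_is_idle(&demo_sessions[i])) {
      idle++;
      if (victim == NULL) {
        victim = &demo_sessions[i];
      }
    }
  }
  /* Reclaim one idle session once the configured limit is reached. */
  if (demo_max_idle > 0 && idle >= demo_max_idle && victim != NULL) {
    victim->in_use = 0;
  }
  for (size_t i = 0; i < DEMO_NUM_SESSIONS; i++) {
    if (!demo_sessions[i].in_use) {
      demo_sessions[i].in_use = 1;
      demo_sessions[i].ref = 0;
      demo_sessions[i].queued = 0;
      return &demo_sessions[i];
    }
  }
  return NULL; /* pool exhausted */
}
```

With demo_reset(0), the third call to demo_get_session() returns NULL, which resembles the "only two requests" symptom. With demo_reset(DEMO_NUM_SESSIONS - 1), each new request finds a slot because an idle session is released first.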

mrdeep1 avatar Jul 24 '23 13:07 mrdeep1

Are you able to get this to work now?

mrdeep1 avatar Sep 18 '23 08:09 mrdeep1

Yes, except for the following two things:

  1. I still have the local patch for LOCK_TCPIP_CORE()
  2. The device is not able to send a multicast response. From debugging I can see that it is being delayed by libcoap, but I have no idea where it goes after that. I need to debug further to get the details.

rpati12 avatar Sep 22 '23 04:09 rpati12

The multicast response is deliberately delayed as per RFC7252 Section 8.2 Request/Response Layer, which uses the async logic - hence the LOCK_TCPIP_CORE() fix you are doing in coap_socket_send().

I will have a look at this.
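For reference, RFC 7252 Section 8.2 gives a lower bound for the leisure period as lb_Leisure = S * G / R, and says the server should pick a random point within the chosen leisure period. The helper functions below are a sketch of that arithmetic only; the demo_* names are hypothetical and this is not how libcoap computes the delay.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound for Leisure in milliseconds: lb_Leisure = S * G / R, with
 * S = estimated response size in bytes, G = estimated group size and
 * R = target data transfer rate in bytes per second. */
static uint32_t demo_leisure_lower_bound_ms(uint32_t s_bytes,
                                            uint32_t g_members,
                                            uint32_t r_bytes_per_s) {
  assert(r_bytes_per_s > 0); /* a zero rate has no meaningful bound */
  return (uint32_t)(((uint64_t)s_bytes * g_members * 1000u) / r_bytes_per_s);
}

/* Map a caller-supplied random number onto the range [0, leisure_ms]. */
static uint32_t demo_pick_delay_ms(uint32_t leisure_ms, uint32_t random_value) {
  return (uint32_t)((uint64_t)random_value % ((uint64_t)leisure_ms + 1u));
}
```

With the RFC's own example values (S = 100 bytes, G = 100, R = 1000 bytes per second), demo_leisure_lower_bound_ms() gives 10000 ms, matching the 10 seconds stated there.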

mrdeep1 avatar Sep 22 '23 09:09 mrdeep1

After getting delayed, this response never comes out. I will debug further on this.

As for the lock: yes, I have currently commented out the lock/unlock in the socket_send() call.

But we can close this issue for now. If I find any issue in multicast send, I will create another issue. Does that make sense?

rpati12 avatar Sep 22 '23 09:09 rpati12

I'm not sure how you are setting up multicast support in the server, as the standard code does not have support for this for LwIP. It would be good to understand what changes you are making here.

mrdeep1 avatar Sep 22 '23 10:09 mrdeep1

But sorting out the multicast problem should be done in a separate issue.

mrdeep1 avatar Sep 22 '23 10:09 mrdeep1

yes

rpati12 avatar Sep 22 '23 10:09 rpati12