
oc_main_poll never returns when onboarded device loses connection

Open lubo-svk opened this issue 3 years ago • 2 comments

Hello, I've found an issue that occurs only when the device is onboarded to the cloud and then loses its network connection. When the connection to the server is lost, the "TLS Process" defined in iotivity-lite/security/oc_tls.c hangs for an extended period (probably a timeout issue).

During this time many events queue up, and the stack cannot process them all before the next "TLS Process" run kicks in. This leads to a state where oc_main_poll never returns, which causes all sorts of issues, mostly in a multithreaded environment where access to the iotivity-lite API is protected with shared recursive mutexes.

I have managed to work around this by limiting how many polls can be invoked during a single oc_main_poll call, but I would like to ask whether there is a better way to solve it. Thanks.

My fix is as follows:

M api/oc_main.c
@@ -280,7 +280,8 @@ oc_clock_time_t
 oc_main_poll(void)
 {
   oc_clock_time_t ticks_until_next_event = oc_etimer_request_poll();
-  while (oc_process_run()) {
+  int max_processes = 100;
+  while (oc_process_run() && (--max_processes > 0)) {
     ticks_until_next_event = oc_etimer_request_poll();
   }
   return ticks_until_next_event;
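
For readers who want to try the idea outside the stack, here is a minimal, self-contained sketch of the same bounded loop. Everything in it is an assumption made for illustration: `stub_process_run`, `stub_etimer_request_poll`, `pending_events`, `bounded_poll` and the `more_work` flag are hypothetical stand-ins, not iotivity-lite APIs. The flag is an addition that is not in the patch above; it lets the caller tell "cap reached" apart from "queue drained".

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t clock_ticks_t; /* hypothetical stand-in for oc_clock_time_t */

/* Simulated event backlog; in the real stack this state lives elsewhere. */
static int pending_events = 0;

/* Hypothetical stub: handles one event, returns how many are still pending. */
static int stub_process_run(void)
{
  if (pending_events > 0) {
    pending_events--;
  }
  return pending_events;
}

/* Hypothetical stub: pretends the next timer expires in 10 ticks. */
static clock_ticks_t stub_etimer_request_poll(void)
{
  return 10;
}

/* Runs at most max_runs passes, then returns even if events remain.
 * *more_work is set to 1 when the cap was hit with events still queued. */
static clock_ticks_t bounded_poll(int max_runs, int *more_work)
{
  assert(max_runs > 0);
  int remaining = 1;
  clock_ticks_t ticks = stub_etimer_request_poll();
  while (max_runs-- > 0 && (remaining = stub_process_run()) > 0) {
    ticks = stub_etimer_request_poll();
  }
  *more_work = remaining > 0;
  return ticks;
}
```

With a backlog of 250 events and a cap of 100, three calls drain the queue, and the caller gets a chance to release its lock between calls. When `more_work` is set, the caller would presumably poll again right away instead of sleeping for the returned tick count.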

lubo-svk avatar Jun 17 '21 13:06 lubo-svk

ping @WAvdBeek @kmaloor

ondrejtomcik avatar Jun 21 '21 06:06 ondrejtomcik

Do we have data on how many polls we do in a time segment?

This looks more like an issue with how multithreading is set up. I agree that the loop should be breakable on a timer tick when multithreading is used.
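
One possible reading of "breakable on a timer tick" is a loop bounded by elapsed time rather than by iteration count. The sketch below is only an illustration under that assumption: `fake_clock`, `fake_process_run`, `backlog` and `poll_with_budget` are hypothetical names, and the clock is simulated (each pass costs one tick) so that the behaviour is deterministic.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t ticks_t; /* hypothetical tick type */

static ticks_t fake_now = 0; /* simulated monotonic clock */
static int backlog = 0;      /* simulated number of queued events */

/* Hypothetical clock source; a real port would read a monotonic timer. */
static ticks_t fake_clock(void)
{
  return fake_now;
}

/* Hypothetical stub: handles one event, costs one tick, returns what is left. */
static int fake_process_run(void)
{
  fake_now += 1;
  if (backlog > 0) {
    backlog--;
  }
  return backlog;
}

/* Processes events until the queue is empty or the tick budget is used up.
 * Returns the number of events still queued when the loop exits. */
static int poll_with_budget(ticks_t budget)
{
  assert(budget > 0);
  ticks_t deadline = fake_clock() + budget;
  int remaining;
  do {
    remaining = fake_process_run();
  } while (remaining > 0 && fake_clock() < deadline);
  return remaining;
}
```

Compared with a fixed iteration cap, a time budget bounds how long other threads wait for the lock regardless of how expensive each pass is, at the cost of one clock read per pass.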

WAvdBeek avatar Jun 25 '21 10:06 WAvdBeek