micro_ros_arduino icon indicating copy to clipboard operation
micro_ros_arduino copied to clipboard

Workaround for micro-ros_publisher_wifi example hanging after 90 seconds

Open kaiaai opened this issue 1 year ago • 6 comments

Issue template

I'd like to share a workaround for micro-ros_publisher_wifi example hanging after 90 seconds.

  • Hardware description: a ESP32-WROOM-32 board connected to a Docker
  • RTOS: Arduino "ESP32 Dev Module" board, default settings, with Micro-ROS
  • Installation type: microros/micro-ros-agent:humble image, Arduino IDE 1.8.13 (Windows), v2.0.7-humble precompiled Micro-ROS
  • Version or commit hash: humble

Steps to reproduce the issue

  • Edit the sketch below to specify your WiFi SSID, password and IP of your PC that will run a Micro-ROS agent. Use the Arduino IDE (Windows) to compile and upload the sketch to an ESP32 module.
  • Launch the Micro-ROS agent. In my case, I'm using a Windows machine with Docker-for-Windows installed. Here is the Windows shell command to launch the Docker Micro-ROS agent image.
docker run --name micro-ros-agent-humble -it -p 8888:8888/udp microros/micro-ros-agent:humble udp4 -p 8888

Once ESP32 connects to WiFi and Micro-ROS agent, I launch another bash and run

ros2 topic echo int32pub
data: 1201
---
data: 1202
---
data: 1203
---
data: 1204
...

Expected behavior

micro-ros_publisher_wifi example keeps running indefinitely long, until powered down

Actual behavior

After received 90 seconds. ros2 topic echo int32pub hangs, does not output message data anymore.

ros2 topic echo int32pub
...
data: 1796
---
data: 1797
---
data: 1798
---
<OUTPUT STOPS>

Additional information

I've fixed this behavior by having ESP32 ping Micro-ROS agent periodically. Please see the code below.

I have also fixed this by

  • changing rclc_publisher_init_best_effort() to rclc_publisher_init_best_effort()
  • adding a subscriber to ESP32

Either of the fixes work, but the ping one is the most convenient for me.

#include <micro_ros_arduino.h>
#include <stdio.h>
#include <rcl/rcl.h>
#include <rcl/error_handling.h>
#include <rclc/rclc.h>
#include <std_msgs/msg/int32.h>

#if !defined(ESP32) && !defined(TARGET_PORTENTA_H7_M7) && !defined(ARDUINO_NANO_RP2040_CONNECT)
#error This example is only avaible for Arduino Portenta, Arduino Nano RP2040 Connect and ESP32 Dev module
#endif

rcl_publisher_t publisher;
std_msgs__msg__Int32 msg;
rclc_support_t support;
rcl_allocator_t allocator;
rcl_node_t node;

#define LED_PIN 2
#define RCCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){error_loop();}}
#define RCSOFTCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){}}

void error_loop(){
  Serial.println(F("error_loop()"));
  while(1){
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
    delay(100);
  }
}

void setup() {
  Serial.begin(115200);
  set_microros_wifi_transports((char*)"WIFI_SSID", (char*)"WIFI_PASSWORD",
    (char*)"192.168.1.UROS_AGENT_IP", 8888);

  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);

  delay(2000);

  allocator = rcl_get_default_allocator();
 
  RCCHECK(rclc_support_init(&support, 0, NULL, &allocator));
  RCCHECK(rclc_node_init_default(&node, "uros_arduino", "", &support));
  RCCHECK(rclc_publisher_init_best_effort(
    &publisher,
    &node,
    ROSIDL_GET_MSG_TYPE_SUPPORT(std_msgs, msg, Int32),
    "int32pub"));

  msg.data = 0;
}

void loop() {
  rcl_ret_t rc = rcl_publish(&publisher, &msg, NULL);
  msg.data++;

  if ((rc == RCL_RET_OK) && (msg.data % 600 == 0)) {
    // pub freezes after 90sec if the line below is commented out
    rmw_uros_ping_agent(1, 1);
  }
    
  delay(50);
}

kaiaai avatar Aug 16 '23 16:08 kaiaai

If you remove the ping and the subscription, and include the publisher inside a timer and spin an executor, does it solve the isuee?

pablogs9 avatar Aug 17 '23 05:08 pablogs9

I have tried publishing inside a timer and spinning the executor in loop() - no, it doesn't resolve the issue. It still freezes exactly the same like the original example - 90 seconds after connecting to the agent.

Here is the timer-and-executor-spin code for your reference:

// Publisher hangs after 90 seconds

#include <micro_ros_arduino.h>
#include <rcl/rcl.h>
#include <rcl/error_handling.h>
#include <rclc/rclc.h>
#include <std_msgs/msg/int32.h>
#include <rclc/executor.h>

#if !defined(ESP32) && !defined(TARGET_PORTENTA_H7_M7) && !defined(ARDUINO_NANO_RP2040_CONNECT)
#error This example is only avaible for Arduino Portenta, Arduino Nano RP2040 Connect and ESP32 Dev module
#endif

rcl_publisher_t publisher;
std_msgs__msg__Int32 msg;
rclc_support_t support;
rcl_allocator_t allocator;
rcl_node_t node;
rcl_timer_t timer;
rclc_executor_t executor;

#define LED_PIN 2
#define RCCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){error_loop();}}
#define RCSOFTCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){}}


void error_loop(){
  Serial.println(F("error_loop()"));
  while(1){
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
    delay(100);
  }
}

void timer_callback(rcl_timer_t * timer, int64_t last_call_time)
{
  RCLC_UNUSED(last_call_time);
  if (timer != NULL) {
    RCSOFTCHECK(rcl_publish(&publisher, &msg, NULL));
    msg.data++;
  }
}
void setup() {
  Serial.begin(115200);
  set_microros_wifi_transports((char*)"WIFI_SSID", (char*)"WIFI_PASSWORD",
    (char*)"192.168.1.UROS_AGENT_IP", 8888);

  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);

  delay(2000);

  allocator = rcl_get_default_allocator();
 
  RCCHECK(rclc_support_init(&support, 0, NULL, &allocator));
  RCCHECK(rclc_node_init_default(&node, "uros_arduino", "", &support));
  RCCHECK(rclc_publisher_init_best_effort(
    &publisher,
    &node,
    ROSIDL_GET_MSG_TYPE_SUPPORT(std_msgs, msg, Int32),
    "int32pub"));

  // create timer,
  const unsigned int timer_timeout = 50;
  RCCHECK(rclc_timer_init_default(
    &timer,
    &support,
    RCL_MS_TO_NS(timer_timeout),
    timer_callback));

  // create executor
  RCCHECK(rclc_executor_init(&executor, &support.context, 1, &allocator));
  RCCHECK(rclc_executor_add_timer(&executor, &timer));
  
  msg.data = 0;
}

void loop() {
  delay(50);
  RCSOFTCHECK(rclc_executor_spin_some(&executor, RCL_MS_TO_NS(100)));
}

The ros2 topic echo int32pub output is same:

PS C:\Users\test> docker exec -it micro-ros-agent-humble bash
root@f93fea7e8390:/uros_ws# ros2 topic list
/int32pub
/parameter_events
/rosout
root@f93fea7e8390:/uros_ws# ros2 topic echo int32pub
...
data: 1762
---
data: 1763
---
data: 1764
---
data: 1765
---
data: 1766
---
data: 1767
---
data: 1768
---
data: 1769
---
<HANGS, NO MORE OUTPUT>

And here is the Micro-ROS agent output FYI

PS C:\Users\test> docker run --name micro-ros-agent-humble -it --rm -p 8888:8888/udp microros/micro-ros-agent:humble udp4 -p 8888
[1692261265.759104] info     | UDPv4AgentLinux.cpp | init                     | running...             | port: 8888
[1692261265.759779] info     | Root.cpp           | set_verbose_level        | logger setup           | verbose_level: 4
[1692261336.530337] info     | Root.cpp           | create_client            | create                 | client_key: 0x63163AC8, session_id: 0x81
[1692261336.530427] info     | SessionManager.hpp | establish_session        | session established    | client_key: 0x63163AC8, address: 172.17.0.1:43676
[1692261336.546124] info     | ProxyClient.cpp    | create_participant       | participant created    | client_key: 0x63163AC8, participant_id: 0x000(1)
[1692261336.577806] info     | ProxyClient.cpp    | create_topic             | topic created          | client_key: 0x63163AC8, topic_id: 0x000(2), participant_id: 0x000(1)
[1692261336.586472] info     | ProxyClient.cpp    | create_publisher         | publisher created      | client_key: 0x63163AC8, publisher_id: 0x000(3), participant_id: 0x000(1)
[1692261336.594476] info     | ProxyClient.cpp    | create_datawriter        | datawriter created     | client_key: 0x63163AC8, datawriter_id: 0x000(5), publisher_id: 0x000(3)

kaiaai avatar Aug 17 '23 08:08 kaiaai

  • Could you paste the output of the agent with -v6.
  • Also does it works if you use a reliable publisher?

pablogs9 avatar Aug 17 '23 09:08 pablogs9

Simply changing rclc_publisher_init_best_effort() to rclc_publisher_init_default() in the original example fixes the problem. Here is the -v6 output. The agent's output keeps running after ros2 topic echo int32pub hangs. Looks like the agent keeps getting the messages, but stops re-publishing them to /int32pub.

PS C:\Users\test> docker run --name micro-ros-agent-humble -it --rm -p 8888:8888/udp microros/micro-ros-agent:humble udp4 -p 8888 -v6
...
0000: 81 01 7C 13 07 01 08 00 13 8A 00 05 7C 13 00 00
[1692309875.235895] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 7D 13 07 01 08 00 13 8B 00 05 7D 13 00 00
[1692309875.287396] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 7E 13 07 01 08 00 13 8C 00 05 7E 13 00 00
[1692309875.336792] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 7F 13 07 01 08 00 13 8D 00 05 7F 13 00 00
[1692309875.388022] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 80 13 07 01 08 00 13 8E 00 05 80 13 00 00
[1692309875.438430] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 81 13 07 01 08 00 13 8F 00 05 81 13 00 00
[1692309875.492285] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 82 13 07 01 08 00 13 90 00 05 82 13 00 00
[1692309875.540532] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 83 13 07 01 08 00 13 91 00 05 83 13 00 00
[1692309875.591825] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 84 13 07 01 08 00 13 92 00 05 84 13 00 00
[1692309875.641564] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 85 13 07 01 08 00 13 93 00 05 85 13 00 00
[1692309875.693420] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 86 13 07 01 08 00 13 94 00 05 86 13 00 00
[1692309875.743523] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 87 13 07 01 08 00 13 95 00 05 87 13 00 00
[1692309875.794443] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 88 13 07 01 08 00 13 96 00 05 88 13 00 00
[1692309875.849060] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 89 13 07 01 08 00 13 97 00 05 89 13 00 00
[1692309875.895805] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 8A 13 07 01 08 00 13 98 00 05 8A 13 00 00
[1692309875.946143] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 8B 13 07 01 08 00 13 99 00 05 8B 13 00 00
[1692309875.996516] debug    | UDPv4AgentLinux.cpp | recv_message             | [==>> UDP <<==]        | client_key: 0x00000000, len: 16, data:
0000: 81 01 8C 13 07 01 08 00 13 9A 00 05 8C 13 00 00
^C[ros2run]: Interrupt

kaiaai avatar Aug 17 '23 22:08 kaiaai

Could you check if this PR solves this issue: https://github.com/micro-ROS/rmw_microxrcedds/pull/305 ?

pablogs9 avatar Aug 18 '23 07:08 pablogs9

I'm afraid it's not obvious for me what and how to rebuild to get that PR in. I tried searching rmw_microxrcedds package in index.ros.org to see which packages depend on it and have to be rebuilt, but it's not listed. I've tried looked at micro_ros_setup .sh built scripts and it appears to be the right place, but still far from obvious.

kaiaai avatar Aug 19 '23 02:08 kaiaai