AP_DDS: add best effort streams and remove unused TFMessage

Open srmainwaring opened this issue 2 years ago • 5 comments

Note: this PR has been reworked to incorporate https://github.com/ArduPilot/ardupilot/pull/25228

  • Remove the unused TFMessage variable tx_dynamic_transforms_topic.
  • Add best effort streams. Reliable and best effort stream types should not be mixed.
  • Add a dependency on the reconnect PR, which also prevents a false status returned from uxr_run_session_time from stopping publishing.

Dependencies

  • https://github.com/ArduPilot/ardupilot/pull/25228
  • https://github.com/ArduPilot/ardupilot/pull/25231

Details

  • Remove unused message variables. The TFMessage consumes a lot of RAM because of its internal array[100]. This causes allocation failures on some hardware (ESP32) because of DRAM limits.
  • Use the TopicIndex enum rather than integer indices for publisher topics. This reduces the risk of an incorrect index and is consistent with the treatment of subscribers and services (Note: moved to https://github.com/ArduPilot/ardupilot/pull/25231)
  • Don't check the agent status flag when writing topics. Doing so causes comms to stop as the failure of a single message to elicit an ack reply from the agent prevents all further messages from being sent (Note: moved to https://github.com/ArduPilot/ardupilot/pull/25228).
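
The effect described in the last bullet can be illustrated with a toy model. The sketch below is not AP_DDS code: `ToyAgent`, `publish_gated` and `publish_ungated` are hypothetical names, and a single boolean stands in for the status returned by the session run call. It only shows why gating every write on that flag lets one missed acknowledgement silence all later messages.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for the agent: acks every message except those listed."""
    drop_acks_for: set = field(default_factory=set)
    received: list = field(default_factory=list)

    def send(self, seq: int) -> bool:
        """Deliver a message; return False if no ack comes back for it."""
        self.received.append(seq)
        return seq not in self.drop_acks_for


def publish_gated(agent: ToyAgent, count: int) -> int:
    """Write only while the last status was good (the behaviour being removed)."""
    status_ok = True
    sent = 0
    for seq in range(count):
        if not status_ok:
            continue  # the flag is never refreshed because nothing is sent again
        status_ok = agent.send(seq)
        sent += 1
    return sent


def publish_ungated(agent: ToyAgent, count: int) -> int:
    """Write every message regardless of the previous status."""
    sent = 0
    for seq in range(count):
        agent.send(seq)
        sent += 1
    return sent


if __name__ == "__main__":
    print(publish_gated(ToyAgent(drop_acks_for={3}), 10))    # 4
    print(publish_ungated(ToyAgent(drop_acks_for={3}), 10))  # 10
```

In this model the gated publisher stops after the fourth message, while the ungated one delivers all ten.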

Testing

Hardware

  • FCU: Matek H743-WING
  • UART2 => SERIAL3: Matek M8Q-CAN GPS
  • UART3 => SERIAL4: CP2102 TTL to USB serial adapter
  • UART6 => SERIAL7: FrSky RX6R
  • UART7 => SERIAL1: Generic SiK Radio

Params

SERIAL0_BAUD     115         # 115200
SERIAL0_PROTOCOL 2           # MAVLink2
SERIAL1_BAUD     57          # 57600
SERIAL1_PROTOCOL 2           # MAVLink2
SERIAL2_BAUD     57          # 57600
SERIAL2_PROTOCOL 2           # MAVLink2
SERIAL3_BAUD     115         # 115200
SERIAL3_PROTOCOL 32          # MSP
SERIAL4_BAUD     115         # 115200
SERIAL4_OPTIONS  0
SERIAL4_PROTOCOL 45          # DDS XRCE

GPS_TYPE         19          # MSP
BARO_PRIMARY     1           # 2ndBaro
BARO_PROBE_EXT   4096        # MSP
COMPASS_TYPEMASK 0 
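
As a reading aid for the listing above, here is a minimal Python sketch that parses `NAME VALUE  # comment` lines and expands the baud shorthand. The function names are hypothetical, and `BAUD_SHORTHAND` contains only the two mappings stated in the comments above (57 and 115). Any other value is returned unchanged.

```python
BAUD_SHORTHAND = {57: 57600, 115: 115200}  # only the values annotated in the listing


def parse_params(text: str) -> dict:
    """Parse 'NAME VALUE  # comment' lines into {name: numeric value}."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue
        name, value = line.split()
        params[name] = float(value) if "." in value else int(value)
    return params


def baud_rate(params: dict, port: int) -> int:
    """Expand the SERIALn_BAUD shorthand where the mapping is known."""
    raw = params[f"SERIAL{port}_BAUD"]
    return BAUD_SHORTHAND.get(raw, raw)


if __name__ == "__main__":
    listing = """
    SERIAL4_BAUD     115         # 115200
    SERIAL4_OPTIONS  0
    SERIAL4_PROTOCOL 45
    """
    params = parse_params(listing)
    print(baud_rate(params, 4), params["SERIAL4_PROTOCOL"])  # 115200 45
```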

Configure, build and upload

./waf configure --board MatekH743 --disable-scripting --enable-dds
./waf rover --upload

Run

The ground control station is run on macOS. Port details will be different for Ubuntu.

MAVProxy:

mavproxy.py --master /dev/cu.usbmodem142101 --console --moddebug 3

micro-ROS agent:

./build/micro_ros_agent/micro_ros_agent serial -D /dev/cu.usbserial-0001 -b 115200 -v6 -r $HOME/Code/ardupilot/ardupilot/libraries/AP_DDS/dds_xrce_profile.xml
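
Since only the serial device differs between hosts, the agent invocation can be assembled from a small per-host table. The sketch below is an illustration in Python, not an existing ArduPilot tool: `agent_command` is a hypothetical helper, the macOS device is the one used above, and the Linux device `/dev/ttyUSB0` is an assumption that will vary with the adapter. The flags are copied from the command above.

```python
import shlex
from pathlib import Path

# Serial device per host OS. The "Linux" entry is an assumed, typical name.
AGENT_DEVICE = {
    "Darwin": "/dev/cu.usbserial-0001",
    "Linux": "/dev/ttyUSB0",
}


def agent_command(system: str, profile: Path, baud: int = 115200) -> list:
    """Return the micro-ROS agent argv for the given host OS name."""
    return [
        "./build/micro_ros_agent/micro_ros_agent",
        "serial",
        "-D", AGENT_DEVICE[system],
        "-b", str(baud),
        "-v6",
        "-r", str(profile),
    ]


if __name__ == "__main__":
    import platform

    profile = Path.home() / "Code/ardupilot/ardupilot/libraries/AP_DDS/dds_xrce_profile.xml"
    system = platform.system()
    if system in AGENT_DEVICE:
        # Print rather than execute, so the command can be reviewed first.
        print(shlex.join(agent_command(system, profile)))
```

Printing the command instead of running it keeps the sketch free of side effects. Passing the list to `subprocess.run` would be the obvious next step.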

Checks:

% ros2 node info /Ardupilot_DDS_XRCE_Client
/Ardupilot_DDS_XRCE_Client
  Subscribers:
    /ap/cmd_vel: geometry_msgs/msg/TwistStamped
    /ap/joy: sensor_msgs/msg/Joy
    /tf: tf2_msgs/msg/TFMessage
  Publishers:
    /ap/battery/battery0: sensor_msgs/msg/BatteryState
    /ap/clock: rosgraph_msgs/msg/Clock
    /ap/geopose/filtered: geographic_msgs/msg/GeoPoseStamped
    /ap/navsat/navsat0: sensor_msgs/msg/NavSatFix
    /ap/pose/filtered: geometry_msgs/msg/PoseStamped
    /ap/tf_static: tf2_msgs/msg/TFMessage
    /ap/time: builtin_interfaces/msg/Time
    /ap/twist/filtered: geometry_msgs/msg/TwistStamped
  Service Servers:
    /ap/arm_motors: ardupilot_msgs/srv/ArmMotors
  Service Clients:

  Action Servers:

  Action Clients:
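
A check like the one above can be scripted. The following is a minimal Python sketch, assuming the output layout shown here (an indented `Section:` header followed by `name: type` lines). `parse_node_info` is a hypothetical helper and not part of ros2cli.

```python
def parse_node_info(text: str) -> dict:
    """Group 'name: type' entries of `ros2 node info` output by section."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.endswith(":"):  # section header, e.g. "Publishers:"
            current = stripped[:-1]
            sections[current] = {}
        elif current is not None and ": " in stripped:
            name, msg_type = stripped.split(": ", 1)
            sections[current][name] = msg_type
    return sections


if __name__ == "__main__":
    sample = """\
/Ardupilot_DDS_XRCE_Client
  Subscribers:
    /ap/joy: sensor_msgs/msg/Joy
  Publishers:
    /ap/clock: rosgraph_msgs/msg/Clock
    /ap/time: builtin_interfaces/msg/Time
  Service Servers:
    /ap/arm_motors: ardupilot_msgs/srv/ArmMotors
"""
    info = parse_node_info(sample)
    expected = {"/ap/clock", "/ap/time"}
    missing = expected - set(info["Publishers"])
    print("missing publishers:", sorted(missing))
```

Comparing the parsed sections against an expected set of topics gives a quick pass/fail signal after flashing a board.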

Figure (dds-matek-h743): initialisation and arming. Arming fails because the GPS does not have a fix when bench testing.

srmainwaring avatar Sep 07 '23 11:09 srmainwaring

We had mentioned putting together a Python script to deploy this on hardware and then run an integration test. Would it be possible to put something together like that so we can continue to quickly test PRs on hardware?

Ryanf55 avatar Sep 07 '23 14:09 Ryanf55

> We had mentioned putting together a Python script to deploy this on hardware and then run an integration test. Would it be possible to put something together like that so we can continue to quickly test PRs on hardware?

Not sure how this would be (or should be) different for testing other aspects of AP on hardware. The problem is that most hardware tends to be custom (devices, port choices, param mappings) and I can't see an easy way to do this without replicating effort put in elsewhere (configurator). Maybe better would be to use the configurator, or do the work to get that running and include DDS as an option?

srmainwaring avatar Sep 07 '23 15:09 srmainwaring

@Ryanf55 - am thinking of cherry-picking the first three commits of this PR into a separate PR, as they don't change behaviour but complete some outstanding housekeeping on the library. Thoughts?

Edit: updated PR to depend on support for automatic reconnect which also addresses the uxr_run_session_time status check causing the publisher update to stop. Details above.

srmainwaring avatar Sep 29 '23 20:09 srmainwaring

Issues:

Issues when running on hardware (MatekH743)

  1. The micro-ROS agent will sometimes crash with:
libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
zsh: abort      ./build/micro_ros_agent/micro_ros_agent serial -D /dev/cu.usbserial-0001 -b  

This may happen when:

  • Restarting the agent and client while ros2 subscribers are still active.
  • Killing a service call that is hanging (no response received).

  2. The best effort streams are working well, but the reliable streams stall after some time:
  • The services fail to get a response.
  • The reliable publishers freeze.

srmainwaring avatar Oct 11 '23 14:10 srmainwaring

Do you want to rebase so we can try to get this in soon? I now have an appropriate hardware test setup with a debugger.

Ryanf55 avatar Dec 06 '23 03:12 Ryanf55