ardupilot icon indicating copy to clipboard operation
ardupilot copied to clipboard

Dds prototype

Open arshPratap opened this issue 4 years ago • 9 comments

Hello Ardupilot community! This PR is work in progress and consists of the prototype code that aims to provide DDS functionality to Ardupilot for effective communication with ROS2 and Native DDS implementations

Added 2 IDL files

  • A simple Num idl to test the xrce dds functionality
  • AP_INS idl file for sending INS info

Added the eProsima XRCE Client lib as submodule Provides Ardupilot an XRCE client to communicate with DDS/ROS2 through : Serial(Main focus) UDP(Only for testing,to be removed in final version)

Current Issues

  • [x] Have tested the code through serial communication and sometimes it is found that data is not properly flushed out in the serial ports resulting in communication to XRCE agent being cut.Working on implementing a function to take care of this issue.
  • [x] Have doubts related to setting up the build system to install the XRCE Client from the necessary submodules

Thanks !

arshPratap avatar Jun 15 '21 15:06 arshPratap

On Wed, 16 Jun 2021, Pierre Kancir wrote:

You should see this as the mavlink xml, it is the definition to generate the AP_INS.cpp code that is auto generated

We don't commit the generated code in ArduPilot :-)

peterbarker avatar Jun 18 '21 01:06 peterbarker

Adding to the conversation so we can keep track and for easier discussion. The pr in its current form add the submodule to the project, however it still requires manual installation of the libraries and its looking to the system folders for those libraries. These are the instructions for building the XRCE client: https://micro-xrce-dds.docs.eprosima.com/en/latest/installation.html

This needs to be added to the waf build system.

jmachuca77 avatar Jun 23 '21 00:06 jmachuca77

it looks like we will need to do config.h.in processing

tridge avatar Jul 16 '21 03:07 tridge

@arshPratap I've added the waf rules for config.h generation here: https://github.com/tridge/ardupilot/commits/ddsPrototype

tridge avatar Jul 16 '21 06:07 tridge

This is a very good work, i am wondering why you opted out to use XRCE client directly and not use https://github.com/micro-ROS/micro_ros_setup for example like https://github.com/micro-ROS/micro_ros_arduino

amfern avatar Sep 06 '21 19:09 amfern

Hello, is there any progress regarding implementing direct DDS ROS2 communication with ardupilot? Thank you.

eMrazSVK avatar Sep 16 '22 10:09 eMrazSVK

Hello, is there any progress regarding implementing direct DDS ROS2 communication with ardupilot? Thank you.

Hi @eMrazSVK ! I am so sorry I wasnt able to work on this in the recent time due to my current predicaments, but I would love to resume working on it and collaborating if you are intereseted! You can mail me : [email protected] for further information.In the mean time I would love if you could take a look here : https://github.com/arshPratap/ardupilot_ros2 and https://discuss.ardupilot.org/t/ardupilot-native-ros2-support/73720

arshPratap avatar Sep 16 '22 11:09 arshPratap

Thank you for your quick response @arshPratap.

If I decided to collaborate on this, could you give me just a really rough estimate of how much time would be needed to fully implement this? Are we speaking weeks, months or years?

Thank you once again.

eMrazSVK avatar Sep 16 '22 11:09 eMrazSVK

Thank you for your quick response @arshPratap.

If I decided to collaborate on this, could you give me just a really rough estimate of how much time would be needed to fully implement this? Are we speaking weeks, months or years?

Thank you once again.

I am not sure on the time frame as I believe there are few things currently in this project that I believe need to be taken care of first :

  • Working on the some of the dependencies to make sure a smooth installation of the project could be done

But I believe that this PR has been idle for a long time and I would love to see this merged as soon as possible. Also the fact that since you are interested to collaborate on this , makes me think we can work on this faster and i think we can have a month's time as a nice deadline on this. @eMrazSVK Let me know what you think...and lets start the work immediately

arshPratap avatar Sep 16 '22 11:09 arshPratap

What is outstanding in this to get it merged?

russkel avatar Nov 21 '22 06:11 russkel

I have attempted to merge in latest ardupilot and get it running with latest versions of MicroCDR/XRCEClient and failed to compile due to lack of configuration:

Processing /home/russ/work/ardupilot/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h.in to /home/russ/work/ardupilot/build/fmuv5/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h
Missing config elements:  ['UCLIENT_SHARED_MEMORY_MAX_ENTITIES', 'UCLIENT_SHARED_MEMORY_STATIC_MEM_SIZE', 'UCLIENT_HARD_LIVELINESS_CHECK_TIMEOUT']

Using the old versions of the module and some patches to the AP code I was able to get it compiling but it doesn't link, presumably because of no POSIX clock funcs on the fmuv5 codebase.

/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: lib/libRover_libs.a(time.c.0.o): in function `uxr_millis':
time.c:(.text.uxr_millis+0x8): undefined reference to `clock_gettime'
/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: lib/libRover_libs.a(time.c.0.o): in function `uxr_nanos':
time.c:(.text.uxr_nanos+0x8): undefined reference to `clock_gettime'
collect2: error: ld returned 1 exit status

There seems to also be an issue with this header file generation being run each waf command?

↳ ./waf clean
building /home/russ/work/ardupilot/build/fmuv5/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h
building /home/russ/work/ardupilot/build/fmuv5/modules/Micro-CDR/include/ucdr/config.h
'clean' finished successfully (0.875s)

I don't have a whole lot of time to work on this at the moment but my branch is here: https://github.com/Greenroom-Robotics/ardupilot/tree/dds-rebase

russkel avatar Nov 22 '22 04:11 russkel

I have attempted to merge in latest ardupilot and get it running with latest versions of MicroCDR/XRCEClient and failed to compile due to lack of configuration:

Processing /home/russ/work/ardupilot/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h.in to /home/russ/work/ardupilot/build/fmuv5/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h
Missing config elements:  ['UCLIENT_SHARED_MEMORY_MAX_ENTITIES', 'UCLIENT_SHARED_MEMORY_STATIC_MEM_SIZE', 'UCLIENT_HARD_LIVELINESS_CHECK_TIMEOUT']

Using the old versions of the module and some patches to the AP code I was able to get it compiling but it doesn't link, presumably because of no POSIX clock funcs on the fmuv5 codebase.

/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: lib/libRover_libs.a(time.c.0.o): in function `uxr_millis':
time.c:(.text.uxr_millis+0x8): undefined reference to `clock_gettime'
/opt/gcc-arm-none-eabi-10-2020-q4-major/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: lib/libRover_libs.a(time.c.0.o): in function `uxr_nanos':
time.c:(.text.uxr_nanos+0x8): undefined reference to `clock_gettime'
collect2: error: ld returned 1 exit status

There seems to also be an issue with this header file generation being run each waf command?

↳ ./waf clean
building /home/russ/work/ardupilot/build/fmuv5/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h
building /home/russ/work/ardupilot/build/fmuv5/modules/Micro-CDR/include/ucdr/config.h
'clean' finished successfully (0.875s)

I don't have a whole lot of time to work on this at the moment but my branch is here: https://github.com/Greenroom-Robotics/ardupilot/tree/dds-rebase

Hey there, sweet work. I'm looking in to see how far along this is. If I have time this week, I'll play with fixing this issues above. Brand new to ardupilot, so I'll need some time to get up to speed.

For anyone like me who doesn't know much about waf, there was a configuration added to enable the compilation that needs to be set:

./waf configure --enable-xrce-dds

Ryanf55 avatar Feb 19 '23 23:02 Ryanf55

@Ryanf55 due to time constraints I wasn't able to work on this further and had to revert to using PX4 which has microROS support in the trunk version. To be honest I would prefer to use ardupilot but don't have time to understand the Ardupilot codebase.

russkel avatar Feb 20 '23 02:02 russkel

I've been able to take the changes from @russkel and remove the Missing config elements errors. I also updated to latest master as of this afternoon. Here's my branch. There were no conflicts in the merge from master.

That said, I now get a cryptic KeyError.

[  79/5764] Compiling modules/Micro-XRCE-DDS-Client/src/c/core/session/common_create_entities.c
Waf: Leaving directory `/home/ryan/Documents/ardu_ws/ardupilot/build/sitl'
Build failed
Traceback (most recent call last):
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/Task.py", line 358, in process
    self.post_run()
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/extras/gccdeps.py", line 161, in post_run
    Task.Task.post_run(self)
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/Task.py", line 722, in post_run
    bld.task_sigs[self.uid()] = self.signature()
  File "/home/ryan/Documents/ardu_ws/ardupilot/Tools/ardupilotwaf/ap_persistent.py", line 32, in _signature
    s = _original_signature(self)
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/Task.py", line 651, in signature
    self.sig_implicit_deps()
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/extras/gccdeps.py", line 167, in sig_implicit_deps
    return Task.Task.sig_implicit_deps(self)
  File "/home/ryan/Documents/ardu_ws/ardupilot/modules/waf/waflib/Task.py", line 843, in sig_implicit_deps
    del bld.imp_sigs[key]
KeyError: b'\xcd.....x1e,(\x985\xa3|'

I've changed the key in case it is uniquely identifying me in any way.

Since it's gcc in the name, here's some info.

ryan@ryan-m93p:~/Documents/ardu_ws/ardupilot$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ryan@ryan-m93p:~/Documents/ardu_ws/ardupilot$ uname -a
Linux ryan-m93p 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:29:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

This only occurs on this branch with the XRCE DDS option enabled. On master, no issues. Yes, I already tried clearing the build folder out and trying again. It happens on first build. If I then just try to build again, the build passes.

Any suggestions?

Ryanf55 avatar Feb 20 '23 02:02 Ryanf55

@Ryanf55 Thanks for this! Its strange..I tried setting up the project on a completely new environment and its strange that I didn't experience the same issues that you and @russkel have experienced. Nevertheless I will be trying to set it up on a new environment.Could you give me few more details regarding your current project environment? Thanks for sharing.(I have added a new image of current setup..where the vehicle is successfully able to connect with an XRCE Agent Screenshot from 2023-02-20 22-15-01

arshPratap avatar Feb 20 '23 16:02 arshPratap

Great to hear. My environment: Ubuntu 22.04 for using humble. I have installed humble through apt. I have updated my branch to use the latest master branch ardupilot, and any of the DDS stuff is also on humble or master as appropriate.

I was hoping to be able to complete the entire development on Linux using SITL; I have a pixhawk 1 available for test on, however am struggling to flash it. I think I need to configure a pseudo serial port. Are you using a physical device and communicating with it over serial USB? Seems like it with /dev/ttyUSB0

Ryanf55 avatar Feb 20 '23 17:02 Ryanf55

I'd like to see this re-worked to have either no topics in C++, or only a single health topic, then implement the individual topics in lua scripts

tridge avatar Feb 20 '23 23:02 tridge

I'd like to see this re-worked to have either no topics in C++, or only a single health topic, then implement the individual topics in lua scripts

Will do. Once @arshPratap and I can get on the same page and replicate current working behavior, I'll work on stripping it down to the health topic.

Would it be useful to go take a look at PX4, since they have DDS working, and see how they dealt with the flash issue? That seems to be the prime concern.

Ryanf55 avatar Feb 21 '23 01:02 Ryanf55

Ok, I have just tried an alternative method using the Dockerfile at the root of the repo, and there were no compilation issues. That said I did notice that, even during a waf clean, there is still configuration of the uxr client config like so:

ardupilot@f0cba57f4262:/ardupilot$ ./waf clean
building /ardupilot/build/sitl/modules/Micro-XRCE-DDS-Client/include/uxr/client/config.h
building /ardupilot/build/sitl/modules/Micro-CDR/include/ucdr/config.h
'clean' finished successfully (0.654s)

Is there any reason to be concerned? Waf's new to me :dancers:

Ryanf55 avatar Feb 21 '23 04:02 Ryanf55

We've updated everything and successfully have it running on the latest ardupilot, ubuntu 22.04, and ROS2 humble, latest XRCE libraries too aside from XTypes. I've updated the README with everything on my branch if anyone is keen to give it a go.

I was going to start work on stripping out all the extra messages, however we are unable to run the MicroXRCE Code generator due to the following issue I have filed. Hopefully EProsima can lend us a hand.
https://github.com/eProsima/Micro-XRCE-DDS-Gen/issues/44

Ryanf55 avatar Feb 24 '23 01:02 Ryanf55

Great ! Since we were able to successfully update the project, we can now move onto working on improving the string handling stuff (via Lua scripting as was suggested in the dev call ) and if possible we can do a test on using microROS apis instead of the xrce client apis.

arshPratap avatar Feb 24 '23 01:02 arshPratap

I'd like to see this re-worked to have either no topics in C++, or only a single health topic, then implement the individual topics in lua scripts

I implemented a standard ROS topic ( builtin_interfaces/msg/Time ) showing how you can use native ROS topics; it's populated by the clock.

I don't know if heartbeat is even necessary. According to OMG DDS-XRCE Standard v1.0 section 8.3.5.12, it may not be required.

The HEARTBEAT submessage is used to enable a transport independent reliability protocol to be implemented.
This specification does not limit a session to use a particular type of transport. If a session transport is able to reliably
send messages in case of disconnection or a wakeup/sleep cycle then these messages may not be required.

Depending on if the user wants to use DDS for critical applications or not could drive using this.

The README is up to date if you want to see the demo. There's also an attempt to do BatteryState, but serialization keeps failing due to my inability to understand all the magic numbers in the serializer.

I have some questions for some of he Ardupilot maintainers how we want to handle the XML profiles and what specifically we should move off into Lua to accomplish the goal of reducing flash size and string manipulation. Now that I cleaned up the code removing all the extra messages, it should be a little more easy to reason about what is message-specific code and what is generic. Would either @tridge, or any of the other devs be able to discuss in the next few days?

Architecture to discuss:

  • [x] Do we need heartbeat
  • [x] Do we need to use XML profiles if there are other ways of configuring the client that consume less memory?
    • No, see 07bc19571d62c00f28d055879372f3267c3fe6be. I removed most of the XML from the device.
  • [x] Since we would expect multiple topics simultaneously just like mavlink, and potentially multiple topics for each message type, what should the client/participant/topic architecture be?
    • For MVP, one client, three topics, one participant, and it's set up at compile time.
  • [x] If a user wants to adjust the rate of data being published from Ardupilot to offboard, should it be similar to the mavlink that bins to the nearest clock?
    • Yes, there are a few different ways. For MVP, hard code the rates. We will cover parameter integration when we decide if we can use the uROS client libraries.
  • [x] Can we make use of something like the ROS2 parameter API to configure publish rates in a method familair to ROS2 users?
    • Yes, exposing all the AP params over DDS through a ROS2 param compatible interface is a highly desirable feature.
  • [x] Would implementing this over CAN or UDP/TCP use less CPU than on serial?
    • For MVP, only serial, however it should work over UDP ethernet with multicast in the future to support Pixhawk 6X and swarming applications.
  • [x] I want to investigate using the uRos client libraries, which are not currently here. Key reason is support for the ROS parameter API, and simplified client logic. Any reason not to?
    • Just flash size. In the interest of time, focus MVP on what we have, and we can add it in as a second option later if it provides clear advantage.

Cheers!

Ryanf55 avatar Feb 24 '23 15:02 Ryanf55

another rebase with build fixes for HAL_ChibiOS: https://github.com/tridge/ardupilot/tree/dds-rebase

tridge avatar Feb 26 '23 05:02 tridge

i've pushed over the PR branch

tridge avatar Feb 26 '23 05:02 tridge

Just updated to master, working on SITL just fine. I tried flashing my Pixhawk 6X for a (first?) hardware test. With MicroROS Agent on or off, no change in behavior below.

In the console logs, it shows a critical failure. The "Got 60" call is added debug.

Critical failure 0x800 sysid=1
disabling flow control on serial 3
disabling flow control on serial 2
Mode RTL
AP: RC Long Failsafe On: RTL
Got 47
Got 56
Got 60
link 1 down
no link
q9link 1 OK
Y\XXlink 1 down
no link

For the first time, I tried running on Pixhawk6X. It hangs on XRCE_Client::init(), specifically on the call to uxr_create_session(&session).

Just before that, in the call to uxr_init_session, there is a lock with UXR_INIT_LOCK(&session->mutex);, unclear how it's cleared.

Also, in the session API, there is a status callback uxr_set_status_callback that isn't being used. I'll add that, perhaps it will provide debugging.

Currently, as far as session creation attempts, we have it set to 2 in gen_config.py

    'UCLIENT_MAX_SESSION_CONNECTION_ATTEMPTS' : 2,

It's not clear why it's hanging (or crashing). When I add in the status message, I get no response. image

Digging further in, the default UCLIENT_MIN_SESSION_CONNECTION_INTERVAL is 1000, we set it to 2, which seems far too low to wait for a connection.

I did find the blocking loop in wait_session_status

do
{
   listen_message(session, remaining_time);
   remaining_time = UXR_CONFIG_MIN_SESSION_CONNECTION_INTERVAL - (int)(uxr_millis() - start_timestamp);
} while (remaining_time > 0 && session->info.last_requested_status == UXR_STATUS_NONE);
UXR_CONFIG_MIN_SESSION_CONNECTION_INTERVAL = 2
uxr_millis = 10100 if I had to guess, not having gdb. We delay 10 seconds to boot. 
start_time_stamp = 10050 or so.

So, we are likely getting negative time here with remaining_time being an int.

A related PR shows the recent change and adds a test on how this is used here.

Adding in a printf in just the right spot in wait_session_status results in successful initialization.

bool wait_session_status(
        uxrSession* session,
        uint8_t* buffer,
        size_t length,
        size_t attempts)
{
    session->info.last_requested_status = UXR_STATUS_NONE;
    printf("wait_session_status() %u\n", __LINE__ );

image

With 1000mS initialization, it's flaky, sometimes working, sometimes not. The heartbeat was also ungodly low, so I bumped that up. image

EDIT: Resolution was interraction of uxr_millis() returning 0 during initialiaztion in Ardupilot since RTC had no GPS time yet on hardware, AND the eprosima client library had underflow in a while loop which made it block forever. I've filed an upstream bug here: https://github.com/eProsima/Micro-XRCE-DDS-Client/issues/350

Ryanf55 avatar Mar 04 '23 01:03 Ryanf55

I've filed an upstream bug here: eProsima/Micro-XRCE-DDS-Client#350

I've pushed a PR to at least protect when building and running in debug mode with some asserts. That protects one of many similar blocks of code there in Debug mode to make it easier to find timing bugs, but offers no runtime protection.

As the source of time for that library, we need to ensure our value of time returned by uxr_millis() is always increasing. From the current usage of that function in their code, the only apparent behavior is timeouts in blocking do/while calls; it has nothing to do with time sync where we need to care about clock drift.

I have a recommendation that we do the following:

  • Use a time source that always increases (boot time)
  • Avoid implementing any function that goes backwards in time
  • Avoid switching time sources internally in uxr_millis
  • DO NOT use RTC time which may not be available on boot.

@tridge Is that reasonable to meet in Ardupilot for this PR? It will avoid having to patch all 30 uses of uxr_millis in the eProsima's Micro XRCE Client library to handle a clock implementation that doesn't progress forwards in time.

Ryanf55 avatar Mar 04 '23 23:03 Ryanf55

This looks great. A minor suggestion would to bring in the geographic_msgs message set as this has GeoPoints which are likely to be used for waypoints, coordinates, etc etc.

russkel avatar Mar 09 '23 05:03 russkel

Will do, added to the radar. Maybe not this PR, but soon.

Ryanf55 avatar Mar 09 '23 07:03 Ryanf55

@russkel Thanks for the help along the way. I just rewrote the history in prep for final merge. If you want to be listed as a co-author on this merge, please share your name and email you'd like to use, and I'll add you on as a Co-Author.

Ryanf55 avatar Mar 12 '23 23:03 Ryanf55

@russkel Thanks for the help along the way. I just rewrote the history in prep for final merge. If you want to be listed as a co-author on this merge, please share your name and email you'd like to use, and I'll add you on as a Co-Author.

Russ Webber - [email protected]

Thanks @Ryanf55

russkel avatar Mar 13 '23 01:03 russkel