
Zenoh: introduce rmw_zenoh_cpp compatibility

Open PetervdPerk-NXP opened this issue 9 months ago • 12 comments

This PR allows Zenoh to directly interface with ROS2 running rmw_zenoh_cpp https://github.com/ros2/rmw_zenoh

rmw_zenoh_cpp isn't stable as of yet, so compatibility might break again.

  • Updates zenoh-pico library to 1.3.4
  • Adds rosidl RIHS01 CRC32 hashes for each uORB message, providing type safety; ROS2 can compare them against its own px4_msgs.
  • Introduces ZENOH_DOMAIN_ID to set a matching ROS_DOMAIN_ID on PX4
  • Allows mapping a uORB instance topic to a custom ROS2 topic, thanks to @BenChung
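As a rough illustration of why the domain ID and the type hash both have to line up, here is a minimal sketch of how a topic key expression could be assembled. The layout (domain ID, topic name, type name, type hash joined by slashes) and the DDS-style type name are assumptions based on my reading of the rmw_zenoh design notes, not something this PR confirms; `topic_key_expression` is a hypothetical helper.

```python
def topic_key_expression(domain_id, topic, type_name, type_hash):
    """Join the parts of a topic key expression (assumed layout, see lead-in)."""
    if domain_id < 0:
        raise ValueError("domain ID must be non-negative")
    # Assumption: the leading slash of the ROS2 topic name is dropped.
    return "/".join([str(domain_id), topic.strip("/"), type_name, type_hash])


# Example values; the hash is the one printed by `ros2 topic info` in this thread.
CPULOAD_HASH = "RIHS01_f41315fb98baf77228df67b159df4fdd978620d38cd5f38dd883d6fa3817a7af"
px4_side = topic_key_expression(
    0, "/fmu/out/cpuload", "px4_msgs::msg::dds_::Cpuload_", CPULOAD_HASH
)
ros_side_other_domain = topic_key_expression(
    1, "/fmu/out/cpuload", "px4_msgs::msg::dds_::Cpuload_", CPULOAD_HASH
)
```

Under that assumed layout, a mismatch in either the domain ID or the hash yields a different key, so the two sides would simply never see each other's data.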

To reproduce:

Have an Ubuntu 24.04 machine with ROS2 Kilted.

Install rmw_zenoh_cpp

sudo apt update && sudo apt install ros-kilted-rmw-zenoh-cpp

Set RMW to rmw_zenoh_cpp

export RMW_IMPLEMENTATION=rmw_zenoh_cpp

Start zenohd using

ros2 run rmw_zenoh_cpp rmw_zenohd

You'll see something like this

2025-03-09T11:27:10.020634Z  INFO ThreadId(02) zenoh::net::runtime::orchestrator: Zenoh can be reached at: tcp/192.168.1.104:7447

On your PX4 build with Zenoh compiled in, enable Zenoh on startup

param set ZENOH_ENABLE 3

Configure the network to connect to the machine running zenohd, 192.168.1.104 in this example. Note that px4_sitl_zenoh defaults to localhost, so there is no need to change it there.

zenoh config net client tcp/192.168.1.104:7447#iface=eth0
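For readers unfamiliar with the locator string, the following sketch splits it into its parts. It assumes the general shape `<protocol>/<host>:<port>#<key>=<value>[;<key>=<value>...]`; `Locator` and `parse_locator` are hypothetical names for illustration and are not part of zenoh-pico or PX4.

```python
from dataclasses import dataclass, field


@dataclass
class Locator:
    protocol: str
    host: str
    port: int
    config: dict = field(default_factory=dict)


def parse_locator(text):
    """Split a locator such as tcp/192.168.1.104:7447#iface=eth0 into its parts."""
    endpoint, _, config_part = text.partition("#")
    protocol, sep, address = endpoint.partition("/")
    if not sep or not protocol or not address:
        raise ValueError(f"expected <protocol>/<address>, got {endpoint!r}")
    host, sep, port = address.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected <host>:<port>, got {address!r}")
    config = {}
    if config_part:
        # Assumption: several options would be separated by ';'.
        for item in config_part.split(";"):
            key, sep, value = item.partition("=")
            if not sep or not key:
                raise ValueError(f"expected <key>=<value>, got {item!r}")
            config[key] = value
    return Locator(protocol, host, int(port), config)


loc = parse_locator("tcp/192.168.1.104:7447#iface=eth0")
```

Here `iface=eth0` ends up in `loc.config`, which makes it easy to see that it names the network interface on the PX4 side rather than being part of the address.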

Create a custom publisher mapping, for example cpuload to the ROS2 /fmu/out/cpuload topic

zenoh config add publisher /fmu/out/cpuload cpuload

To verify your config, type zenoh config; it should look similar to this

nsh> zenoh config
Network config:
Mode: client
Locator: tcp/192.168.1.104:7447#iface=eth0

Publisher config:
Topic: /fmu/out/cpuload
Type: cpuload

Reboot PX4

Now, on your Linux machine, get the topic info using

peter@desktop-peter:~/ros2_ws_kilted$ ros2 topic info /fmu/out/cpuload -v
Type: px4_msgs/msg/Cpuload

Publisher count: 1

Node name: px4_aabbcc00000000000000000000000000
Node namespace: /
Topic type: px4_msgs/msg/Cpuload
Topic type hash: RIHS01_f41315fb98baf77228df67b159df4fdd978620d38cd5f38dd883d6fa3817a7af
Endpoint type: PUBLISHER
GID: 52.57.f3.91.ca.da.7e.c3.c9.dd.d7.78.94.b1.25.67
QoS profile:
  Reliability: RELIABLE
  History (Depth): KEEP_LAST (7)
  Durability: VOLATILE
  Lifespan: Infinite
  Deadline: Infinite
  Liveliness: AUTOMATIC
  Liveliness lease duration: Infinite

Subscription count: 0
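The topic type hash in the output above is what has to agree between PX4 and the px4_msgs package on the ROS2 side. A minimal sketch for comparing two such strings follows; it assumes the format seen above (the `RIHS01_` prefix followed by 64 lowercase hexadecimal digits), and `parse_rihs01` and `hashes_match` are hypothetical helpers.

```python
import re

# Assumed format, taken from the `ros2 topic info` output above.
RIHS01_PATTERN = re.compile(r"RIHS01_([0-9a-f]{64})")


def parse_rihs01(text):
    """Return the raw digest bytes of a RIHS01 string, or raise ValueError."""
    match = RIHS01_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a RIHS01 hash: {text!r}")
    return bytes.fromhex(match.group(1))


def hashes_match(px4_hash, ros_hash):
    """True when both sides report the same type hash."""
    return parse_rihs01(px4_hash) == parse_rihs01(ros_hash)


reported = "RIHS01_f41315fb98baf77228df67b159df4fdd978620d38cd5f38dd883d6fa3817a7af"
digest = parse_rihs01(reported)
```

If the hash reported for the PX4 publisher differs from the one your local px4_msgs build produces, the message definitions are probably out of sync.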

Subscribe to a topic

ros2 topic echo /fmu/out/cpuload

PetervdPerk-NXP avatar Mar 09 '25 11:03 PetervdPerk-NXP

@PetervdPerk-NXP This is super cool. Thanks for the huge effort.

The tests are failing. There seems to be an issue with dependencies: ModuleNotFoundError: No module named 'lark'. Could you please take a quick look?

mrpollo avatar Mar 11 '25 16:03 mrpollo

@PetervdPerk-NXP This is super cool. Thanks for the huge effort.

The tests are failing. There seems to be an issue with dependencies: ModuleNotFoundError: No module named 'lark'. Could you please take a quick look?

Lark is a dependency that got pulled in to parse the idl files and generate RIHS01 hashes. I've added lark to requirements.txt but I think the CI docker images need to be updated as well.

PetervdPerk-NXP avatar Mar 11 '25 18:03 PetervdPerk-NXP

I have tried to follow the instructions, but the ZENOH_ENABLE parameter does not seem to be set correctly. I am running the px4_sitl_zenoh build using the command make px4_sitl_zenoh gz_x500 (Ubuntu arm64 VM). When I shut down the process using Ctrl+C and restart it, the parameter is no longer set:

This is the output of param show ZENOH_ENABLE after setting the parameter to 3:

pxh> param set ZENOH_ENABLE 3
pxh> param show ZENOH_ENABLE 3
Symbols: x = used, + = saved, * = unsaved
x + ZENOH_ENABLE [1003,1915] : 3

 1004/1916 parameters used.
pxh>

and this is after restarting the SITL:

pxh> param show ZENOH_ENABLE
Symbols: x = used, + = saved, * = unsaved

 1003/1916 parameters used.

zenoh status reports that the service is not running:

pxh> zenoh status
INFO  [zenoh] not running
Command 'zenoh' failed, returned -1.
pxh>

and attempting to start zenoh manually results in a segmentation fault:

pxh> zenoh start
pxh> INFO  [zenoh] Opening session...
Segmentation fault

Tuxliri avatar Mar 23 '25 17:03 Tuxliri

Hi @Tuxliri

Thanks for testing. Indeed, it seems autostart wasn't enabled for SITL. I've pushed a new commit that also enables autostart in SITL.

I just ran Zenoh + SITL on my own machine, and it seems to be working fine

pxh> zenoh start
pxh> INFO  [zenoh] Opening session...
INFO  [zenoh] Starting reading/writing tasks...

Would you be able to share the operating system you're using and which gcc is used to build sitl? When compiling you should see something like this

-- The CXX compiler identification is GNU 11.4.0
-- The C compiler identification is GNU 11.4.0

PetervdPerk-NXP avatar Mar 23 '25 19:03 PetervdPerk-NXP

Drafting for now so I can wait on Kilted code freeze of rmw_zenoh_cpp.

Right now ABI is a bit too unstable to be usable.

PetervdPerk-NXP avatar Apr 11 '25 15:04 PetervdPerk-NXP

I've worked extensively with uxrce and DDS in the past, which has strongly colored my decision to try and avoid using it even if that incurs substantial headache

I didn’t like uxrce and DDS either, so I completely understand where you're coming from. This work is something I’m doing in my own time, definitely not a full-time commitment. It builds on some earlier work where two PX4 flight controllers operated redundantly. Back then, rmw_zenoh had just been announced. This is just an attempt to push that idea a bit further, though it still needs plenty of testing and fixing.

This can probably be solved by unadvertising with orb_unadvertise afterwards when the publisher is getting cleaned up.

Just pushed a fix for that issue

For what it's worth, uxrce is using orb_advertise-non-multi for all topics; it does support advertise_multi but doesn't seem to set up any topics to advertise with it by default.

That needs some investigation. uxrce is limited to its own predefined set of topics, so it can probably get away with using orb_advertise-non-multi. But with Zenoh all topics are exposed. An example would be a PX4-based ethernet sensor, e.g. an IMU: this needs to be orb_advertise_multi to work, so we will have to figure out how to solve this under the hood without complicating matters further for end-users.

There's a few other areas in the Zenoh client that I'd like to work on (in particular, publisher/subscriber deletion and automatic topic creation), but I'm not sure how much you intend to use the "native" zenoh client in the future in favor of whatever ROS2 intends to do so wanted to ask before making a PR for that.

Refinements are always welcome, although I'd be careful about doing too many things dynamically. Zenoh is already quite versatile: it doesn't publish data if there are no subscribers for a topic.

PetervdPerk-NXP avatar May 04 '25 13:05 PetervdPerk-NXP

That needs some investigation. uxrce is limited to its own predefined set of topics, so it can probably get away with using orb_advertise-non-multi. But with Zenoh all topics are exposed. An example would be a PX4-based ethernet sensor, e.g. an IMU: this needs to be orb_advertise_multi to work, so we will have to figure out how to solve this under the hood without complicating matters further for end-users.

Refinements are always welcome, although I'd be careful about doing too many things dynamically. Zenoh is already quite versatile: it doesn't publish data if there are no subscribers for a topic.

My immediate thought is to extend the pub/sub config CSV to let you specify whether a topic is multi or not and then provide a default config, much like uxrce does. The config could be fairly extensive by default - since as you mention Zenoh is pretty efficient about having a bunch of topics - so that beginner users don't have to do much configuration out of the box. The IMU endpoints could then be set up as multi by default so a Zenoh-attached ethernet IMU would "just work." Users who want to customize it (and are thus opting in to additional complexity) can then override the config as a ROMFS overlay or a simple upload.

The thought then is for me to implement the:

  • Extended CSV (and CLI) to allow selection of multi or non-multi,
  • Default set of topics, probably as part of the build process for this.
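To make the extended-CSV idea concrete, here is a minimal sketch of what parsing such a config could look like. The file format is entirely hypothetical (semicolon-separated `topic;type` with an optional third `multi`/`single` column, `#` for comments) and is not the format PX4 uses today; `PublisherEntry` and `parse_pub_config` are illustrative names.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PublisherEntry:
    topic: str
    uorb_type: str
    multi: bool


def parse_pub_config(text, delimiter=";"):
    """Parse the hypothetical extended publisher CSV described above."""
    entries = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Skip blank lines and comment lines.
        if not row or not row[0].strip() or row[0].lstrip().startswith("#"):
            continue
        if len(row) not in (2, 3):
            raise ValueError(f"expected 2 or 3 columns, got {row!r}")
        flag = row[2].strip().lower() if len(row) == 3 else ""
        if flag not in ("", "single", "multi"):
            raise ValueError(f"unknown mode {flag!r}")
        # Omitting the third column keeps today's non-multi behaviour.
        entries.append(PublisherEntry(row[0].strip(), row[1].strip(), flag == "multi"))
    return entries


example = """# topic;type;mode
/fmu/out/cpuload;cpuload
/fmu/out/sensor_accel;sensor_accel;multi
"""
entries = parse_pub_config(example)
```

Making the third column optional means existing two-column configs would keep working, and only users who opt in need to think about multi versus non-multi.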

Also, for some context, I'm working on a drone that'll hopefully be using Zenoh to communicate to peripheral ESCs, IMUs, and other actuators over an automotive ethernet network, so this is a very apt discussion. Thank you!

BenChung avatar May 05 '25 02:05 BenChung

Extended CSV (and CLI) to allow selection of multi or non-multi,

I know this is tempting, and I'm even okay with an optional override. But ideally I don't want end-users to have to deal with whether a topic is multi or non-multi. I was thinking we could maybe add hints in the .msg file, just like the topic names, and based on that initialize it as multi or non-multi. https://github.com/PX4/PX4-Autopilot/blob/3c390952715d64c48fe8e32000b3aeb8a6f85f68/msg/ActuatorOutputs.msg#L8

Default set of topics, probably as part of the build process for this.

This is somewhat in place, but right now it's just an empty string. https://github.com/PX4/PX4-Autopilot/blob/9887fde3c13e1cd97a6dca9cccac3c9391104c15/src/modules/zenoh/zenoh_config.cpp#L54-L55

We could hardcode the defaults there. We could also just parse the uxrce YAML file, so we maintain compatibility while being flexible, https://github.com/PX4/PX4-Autopilot/blob/main/src/modules/uxrce_dds_client/dds_topics.yaml and fill the default_pub_config and default_sub_config strings with that.
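A rough sketch of that idea: take the publications from an already-loaded dds_topics.yaml and derive the equivalent `zenoh config add publisher` commands. It assumes the YAML has been parsed into a dict with a `publications` list of `topic`/`type` entries, and that the Zenoh type name is the snake_case form of the message name; both are assumptions, and `publisher_commands` is a hypothetical helper.

```python
import re


def camel_to_snake(name):
    """VehicleOdometry -> vehicle_odometry (assumed uORB naming convention)."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def publisher_commands(topics):
    """Build one `zenoh config add publisher` command per publication."""
    commands = []
    for entry in topics.get("publications", []):
        # "px4_msgs::msg::Cpuload" -> "Cpuload"
        message_name = entry["type"].rsplit("::", 1)[-1]
        commands.append(
            f"zenoh config add publisher {entry['topic']} {camel_to_snake(message_name)}"
        )
    return commands


# Stand-in for the parsed YAML; the entries are examples only.
topics = {
    "publications": [
        {"topic": "/fmu/out/cpuload", "type": "px4_msgs::msg::Cpuload"},
        {"topic": "/fmu/out/vehicle_odometry", "type": "px4_msgs::msg::VehicleOdometry"},
    ]
}
commands = publisher_commands(topics)
```

The same list could just as well be joined into the default_pub_config string at build time; the command form is used here only because it matches the CLI shown earlier in the thread.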

Also, for some context, I'm working on a drone that'll hopefully be using Zenoh to communicate to peripheral ESCs, IMUs, and other actuators over an automotive ethernet network, so this is a very apt discussion. Thank you!

Cool, that sounds very similar to what I'm doing. If you want, you can add me on Discord: petervdperk_nxp

PetervdPerk-NXP avatar May 05 '25 09:05 PetervdPerk-NXP

@mrpollo how do we get lark in the new CI containers? I've added it to Tools/setup/requirements.txt, but that doesn't seem to do the trick.

PetervdPerk-NXP avatar Jun 15 '25 17:06 PetervdPerk-NXP

@PetervdPerk-NXP it will require a new dev-container to be released and pushed before we can add to the CI workflow, if you send a PR with just the lark dependency I'll do that for you

mrpollo avatar Jun 16 '25 16:06 mrpollo

Bumping the .github/workflows/checks.yml container shows a possibly valid error from shellcheck, I think. #23332 added the if [ ${VEHICLE_TYPE} == none ]. @MaEtUgR do you think the CI error is correct?

In /workspace/ROMFS/px4fmu_common/init.d/rcS line 227:
		if [ ${VEHICLE_TYPE} == none ]
                                     ^-- SC3014 (error): In dash, == in place of = is not supported.


In /workspace/ROMFS/px4fmu_common/init.d/rcS line 238:
		if [ ${VEHICLE_TYPE} == none ]
                                     ^-- SC3014 (error): In dash, == in place of = is not supported.

The other new error from lcov might be invalid though.

Processing build/px4_sitl_test/src/modules/mavlink/CMakeFiles/unit-MavlinkStatustextHandler.dir/MavlinkStatustextHandlerTest.cpp.gcda
geninfo: ERROR: mismatched end line for _ZN37MavlinkStatustextHandler_Singles_Test8TestBodyEv at /workspace/src/modules/mavlink/MavlinkStatustextHandlerTest.cpp:44: 44 -> 67
	(use "geninfo --ignore-errors mismatch ..." to bypass this error)
make: *** [Makefile:423: tests_coverage] Error 1

@dagar are you okay with adding --ignore-errors mismatch to geninfo, as the error message suggests (use "geninfo --ignore-errors mismatch ..." to bypass this error)?

PetervdPerk-NXP avatar Jun 17 '25 20:06 PetervdPerk-NXP

I'm trying to fix those errors here https://github.com/PX4/PX4-Autopilot/pull/25066

mrpollo avatar Jun 17 '25 20:06 mrpollo

@dagar CI is happy now, could you review and merge?

PetervdPerk-NXP avatar Jul 07 '25 12:07 PetervdPerk-NXP

This pull request has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/px4-dev-call-july-9-2025-team-sync-and-community-q-a/46369/2

DronecodeBot avatar Jul 08 '25 19:07 DronecodeBot

@PetervdPerk-NXP there are a couple of pieces in this PR that need to be settled, I think, based on what we discussed today during the call. Besides, it would be cool if we could make it closer to plug and play, for instance with a param to switch between xrce-dds and zenoh. Maybe we could discuss it on the next call, if you could join?

farhangnaderi avatar Jul 09 '25 15:07 farhangnaderi

@PetervdPerk-NXP @farhangnaderi At least from my perspective there's no reason xrce-dds and zenoh can't run at the same time (modulo binary size, which a runtime flag wouldn't improve). I'm working to get a vehicle flying that (due to legacy code) is going to be using xrce-dds for position information and zenoh for control, for example.

If I can ask, though, what are the remaining issues that need to be addressed in your view?

BenChung avatar Jul 13 '25 17:07 BenChung

circle.webm

Flight tested this; it's running Julia in closed loop offboard mode on a Pi attached to our custom FC (STM32H745). This is running alongside uxrce which is being used for mocap fusion. Some notes from that integration:

  • Error recovery from platform misconfiguration (e.g. missing urandom, missing the needed mutex support, etc.) is not good; it has a habit of looping infinitely.
  • It would be good to expose the Zenoh config directory as a param.
  • Zenoh has a bad habit of taking the rest of the FC down with it if the initial setup fails. I think that there's some better scheduler integration we could do.

BenChung avatar Jul 15 '25 08:07 BenChung

This pull request has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/px4-dev-call-july-23-2025-team-sync-and-community-q-a/46617/3

DronecodeBot avatar Jul 23 '25 11:07 DronecodeBot

Testing Needed: SITL & Hardware. Please follow the instructions from Peter.

mrpollo avatar Jul 23 '25 15:07 mrpollo

@dagar @mrpollo @bkueng Okay to merge?

PetervdPerk-NXP avatar Aug 21 '25 07:08 PetervdPerk-NXP

Thanks for the effort, @PetervdPerk-NXP, this is awesome.

mrpollo avatar Aug 22 '25 15:08 mrpollo

Congrats @PetervdPerk-NXP and all the reviewers/contributors on getting this PR through! Fantastic to have Zenoh working with PX4

igalloway avatar Aug 22 '25 16:08 igalloway

This pull request has been mentioned on Discussion Forum for PX4, Pixhawk, QGroundControl, MAVSDK, MAVLink. There might be relevant details there:

https://discuss.px4.io/t/zenoh-on-px4/35267/4

DronecodeBot avatar Aug 28 '25 13:08 DronecodeBot