ROS2 Zenoh Support
Hey hey! Finally getting a chance to use the project and love it.
I was curious if there's been any discussion of "natively" interacting with ROS2 over Zenoh now that it's the default rmw (and supported earlier with an option). Seems like it'd be similar to the TCPROS integration except using Zenoh as the message layer and with whatever changes ROS2 made to the message protocol.
YES! There has been a lot of discussion on this and it is 100% something I want to support.
I currently don't have a timeline / plan attached to building out that support, but given all the infrastructure we already have I don't expect it to be a very heavy lift.
Sweet! Were there to be someone with some time to dedicate to this, where would one start for examples, specifically on integration testing?
It certainly seems like the Zenoh ROS1 as is would make a very compelling starting point.
Oh heck yeah! Super happy to help support someone diving into this!
Honestly the first thing that I need is a little bit of reverse engineering what is going on behind rmw in ROS2. I don't actually know the following:
- How are ROS2 messages encoded for Zenoh? I'm assuming among the other ROS2 rust crates we can find the equivalent of https://github.com/adnanademovic/serde_rosmsg , but we really need something based off of Serde or our whole message generation stuff will have to change.
- How are topics and types encoded or mangled? zenoh-ros1-bridge does some "interesting" translation from ros topic name to zenoh topic name: https://github.com/eclipse-zenoh/zenoh-plugin-ros1/issues/131. I wouldn't be surprised if rmw was doing something similar. Again just need to look into what the existing ROS2 rust crates do.
As far as setting up unit / integration tests go:
- We already have a pretty solid ROS2 integration test setup with GitHub Actions. Check out: https://github.com/RosLibRust/roslibrust/blob/master/.github/workflows/iron.yml
- If you write a test and feature flag it with #[cfg(feature = "ros2_test")], it will automatically run in CI in a ROS2 docker image with the rosapi node running. I would use that node as your "target" for things like subscribing and service calls.
- That image (I assume?) is currently using DDS as the rmw backend. So we'd need to look at: https://github.com/RosLibRust/roslibrust/blob/master/docker/iron/Dockerfile and likely configure a ROS2 Zenoh specific image.
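For anyone who hasn't used feature-gated tests before, here is a minimal sketch of the shape such a test could take. The helper `rosapi_service` and the "/rosapi/<name>" naming are assumptions made up for illustration, not something taken from the roslibrust test suite, and the test body only checks a name rather than talking to a real node.

```rust
/// Hypothetical helper: build the name of a service offered by the rosapi node.
/// The "/rosapi/<name>" layout is an assumption for illustration.
fn rosapi_service(name: &str) -> String {
    format!("/rosapi/{}", name)
}

// Compiled and run only when the crate is built with `--features ros2_test`,
// so a plain `cargo test` on a machine without ROS2 skips it entirely.
#[cfg(feature = "ros2_test")]
#[test]
fn rosapi_topics_service_name() {
    let service = rosapi_service("topics");
    // A real test would call this service through the backend under test;
    // this sketch only checks the name it would target.
    assert_eq!(service, "/rosapi/topics");
}
```

Locally this would run with `cargo test --features ros2_test`, assuming the feature is declared in the crate's Cargo.toml.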
Breaking down the steps now that I've thought about it a bit more.
- Use Iron Dockerfile as an example to create a new docker image that sets up a ROS2 environment with Zenoh configured as the rmw backend. This may be as easy as setting up a Kilted docker image in the exact same way?
- Use Iron workflow.yml as an example to create a new github workflow that uses the Zenoh docker image and invokes tests against it.
- Actually start development on the backend.
If you don't want to set up all of this infrastructure at the start and just want to dive in on the backend implementation, what I typically do when developing locally is:
- Get a docker image that has a ROS2 environment in it
- Start up that docker image in "network_mode: host" with some nodes running
- See https://github.com/RosLibRust/roslibrust/blob/master/docker/noetic_compose.yaml for example
- On my local system, start developing the code and try to talk to the nodes. If needed, drop into a shell in that docker image and use command line tools with wireshark to snoop on networking and see how things are talking.
Sorry for long response, but thinking through this was actually super helpful for me as well.
This is awesome! I'll take a stab at this sometime and see what I can find. It'd be sick to have a "native" ROS2 integration that required none of the workspace/environment trappings of the full system.
A little investigation and it looks like this is how ROS2 messages are generated: https://github.com/ros2-rust/rosidl_rust/blob/main/rosidl_generator_rs/rosidl_generator_rs/init.py
Templates in here: https://github.com/ros2-rust/rosidl_rust/tree/main/rosidl_generator_rs/resource
It’s all done via CMake and a Python script generating actual file output at build time with manual templates. Not the prettiest but I guess it fits the usual pattern.
I see two potential paths:
- Reimplement it all as a proc_macro. Cleaner, but risks diverging and could be a decent amount of work (but also ROS messages are reasonably simple so maybe not)
- Wrap this Python in a build.rs and kind of kludge it into being
Thoughts?
Doing some digging today...
- R2R - Uses the rosidl stuff via CMake to generate the C files for message serialization and then creates Rust binding to those. Clever, but means you still need ROS2 installed to generate the serialization code.
- rclrs - Looks like it actually does the same thing? Uses rosidl_runtime_rs to bind to generated C libraries... They have a longer documented explanation here. The actual generation library lives here, and is written by esteve, whom I've met at ROSCon.
Digging into the implementation of rosidl_rust it is again a python package...
So we're left with two major options, which largely depend on appetite and belief about the "right way" to work with the ROS2 ecosystem.
rclrs upholds the "ROS idiomatic" approach, and has built a ROS2 compatible generator, which builds correct bindings to pass the correct data to rmw. R2R is doing roughly the same thing, but has taken some shortcuts to make things easier to write. Our first option is just to adopt this existing infrastructure, but I'm honestly not sure it would actually work... Neither rclrs nor r2r is actually doing the serialization in Rust; they are binding to rmw and just passing the correct structs into it via an FFI. If we actually wanted a "pure Rust", no ROS baggage implementation, neither of these would work. We could instead just end up writing a roslibrust wrapper for r2r...
The second option (which is what I prefer), is to actually figure out what rmw is doing to perform the serialization and re-create that byte-for-byte in Rust.
- Pro: Would give us a Pure Rust, Pure Cargo, No ROS dependency implementation
- Con: Goes against what ROS2 was trying to accomplish with RMW and creates a fork in the ecosystem where they can no longer change rmw in a backwards compatible way and still have the whole ecosystem work. I'm sure some ROS developer will end up cursing our name if we go too far down this route.
I'm going to assume for the time being, that we're interested in this second option... Digging deeper on what this would take.
rmw_zenoh_cpp specifies "cdr" as the serialization type for the zenoh middleware.
Their design doc calls this out more... So I think if we can build a serializer that does "cdr" we should "just work" with the ecosystem (for now).
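To make "cdr" a bit more concrete, here is a small hand-rolled sketch of what plain little-endian CDR looks like on the wire for a message shaped like std_msgs/String. This is based on my understanding of the format (a 4-byte encapsulation header, then fields aligned relative to the end of that header, with strings written as a u32 length that counts a trailing NUL). The function names are made up for illustration and this is not code from rmw_zenoh_cpp or cdr-rs; it would need checking against real captured messages.

```rust
/// Encapsulation header assumed for plain little-endian CDR:
/// representation id 0x0001 followed by two option bytes (zero here).
const CDR_LE_HEADER: [u8; 4] = [0x00, 0x01, 0x00, 0x00];

/// Pad `buf` with zeros until the payload (everything after the 4-byte
/// header) is aligned to `align` bytes. `buf` must already hold the header.
fn pad_to(buf: &mut Vec<u8>, align: usize) {
    while (buf.len() - CDR_LE_HEADER.len()) % align != 0 {
        buf.push(0);
    }
}

/// Append a string: u32 length (including the trailing NUL), the UTF-8
/// bytes, then the NUL terminator.
fn write_string(buf: &mut Vec<u8>, s: &str) {
    pad_to(buf, 4);
    buf.extend_from_slice(&(s.len() as u32 + 1).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
    buf.push(0);
}

/// Append a u32, aligned to 4 bytes.
fn write_u32(buf: &mut Vec<u8>, v: u32) {
    pad_to(buf, 4);
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Encode a message with a single string field named `data`.
fn encode_string_msg(data: &str) -> Vec<u8> {
    let mut buf = CDR_LE_HEADER.to_vec();
    write_string(&mut buf, data);
    buf
}
```

Under these assumptions, `encode_string_msg("hello")` yields the 4 header bytes, then `06 00 00 00`, then `hello` and a NUL. `write_u32` is only there to show where alignment padding would appear when a fixed-size field follows a string.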
Digging around for "cdr" implementations:
- Here is foxglove's js impl: https://github.com/foxglove/cdr
- Here is a python impl: https://gitlab.com/ternaris/rosbags/-/blob/master/src/rosbags/serde/cdr.py?ref_type=heads
- OKAY HERE WE GO: https://github.com/hrektts/cdr-rs There's a Rust native implementation, and it works with serde!
So here is an updated implementation plan based off of what I've learned.
- We should test drive cdr-rs and figure out if we pass the existing roslibrust generated message types into it, if we get byte-for-byte identical serialization matching rmw. Definitely would need some unit testing around this, and capturing "good" example ros2 messages with some rosbags. Once we have matching serialization going we should be most of the way there.
- We can look at what rmw_zenoh_cpp is doing for all the topic mangling, service calls, etc. I can't imagine it isn't mapping things pretty dang close to zenoh standard. So it should be pretty quick to prototype some zenoh.pub(cdr-rs::serialize(my_msg)) and just see if we can get a ROS2 node to receive the publish.
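For the capture-and-compare step in the first bullet, a tiny std-only decoder can be handy for eyeballing captured payloads before cdr-rs is wired in. This is a sketch under the assumption that the payload is plain little-endian CDR holding a single string field (as in std_msgs/String); `decode_string_msg` is a made-up name, and trailing padding after the string is simply ignored.

```rust
/// Decode a payload assumed to hold one CDR string field.
fn decode_string_msg(bytes: &[u8]) -> Result<String, String> {
    // 4-byte encapsulation header + 4-byte string length.
    if bytes.len() < 8 {
        return Err("payload too short".to_string());
    }
    // Representation id 0x0001 is taken to mean plain CDR, little-endian.
    if bytes[0] != 0x00 || bytes[1] != 0x01 {
        return Err(format!("unsupported encapsulation: {:02x} {:02x}", bytes[0], bytes[1]));
    }
    let len = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    let end = 8usize.checked_add(len).ok_or("length overflow")?;
    if len == 0 || end > bytes.len() {
        return Err("string length out of bounds".to_string());
    }
    // The length counts a trailing NUL that is not part of the Rust string.
    if bytes[end - 1] != 0 {
        return Err("missing NUL terminator".to_string());
    }
    String::from_utf8(bytes[8..end - 1].to_vec()).map_err(|e| e.to_string())
}
```

Feeding it bytes pulled out of a rosbag or a wireshark capture would be a cheap sanity check that the format assumptions hold before writing the byte-for-byte unit tests.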
Honestly, now that I know "cdr" is the serialization format rmw_zenoh_cpp is using, and that there are example implementations of it in multiple languages + a Rust implementation that might already work, I'm starting to feel pretty confident about this approach.
Again, I apologize for the long message, but if I don't write all of this down somewhere I forget it immediately, and these public issues / discussions end up serving as a really good record of why these decisions were made!
Hell yeah, that's great info. I'm glad cdr is apparently decently well supported out there. I'll do some poking around for the mangling part and type declarations.
Oh hey, this looks like the ticket: https://github.com/ros2/rmw_zenoh/blob/f3ac079fb98ea72a690c04f6d93ab10293a142e8/rmw_zenoh_cpp/src/detail/liveliness_utils.cpp#L87
topic_keyexpr_ = std::to_string(domain_id);
topic_keyexpr_ += "/";
topic_keyexpr_ += strip_slashes(name_);
topic_keyexpr_ += "/";
topic_keyexpr_ += type_;
topic_keyexpr_ += "/";
topic_keyexpr_ += type_hash_;
Simple enough!
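A rough Rust equivalent of the snippet above, as a sketch. I'm assuming `strip_slashes` just drops leading and trailing '/' characters (I haven't checked the C++ helper in detail), and the type name and hash strings in the usage note are placeholders rather than real values.

```rust
/// Assumed behavior of the C++ `strip_slashes` helper: drop leading and
/// trailing '/' characters from the ROS name.
fn strip_slashes(name: &str) -> &str {
    name.trim_matches('/')
}

/// Build the key expression as "<domain_id>/<name>/<type>/<type_hash>",
/// mirroring the concatenation order in the C++ snippet.
fn topic_keyexpr(domain_id: usize, name: &str, type_name: &str, type_hash: &str) -> String {
    format!(
        "{}/{}/{}/{}",
        domain_id,
        strip_slashes(name),
        type_name,
        type_hash
    )
}
```

For example, `topic_keyexpr(0, "/chatter", "some_type", "some_hash")` gives `0/chatter/some_type/some_hash`.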
Reading everything back, I think it is worth explicitly writing out the design I'm hoping for as plainly as I can. With the risks / justifications:
- Existing roslibrust_codegen and roslibrust_codegen_macro will be used to parse ROS1 and ROS2 idl files and generate Rust structs with serde derive macros on them.
  a. Risk: I'm assuming / hoping that the SAME Rust struct representation will work for ROS1 / ROS2 / rosbridge etc. and we'll be able to fit any ROS2 specific logic into the serializer.
  b. Performance Loss: Our Rust representation will NOT be a byte-for-byte, memcpy-equivalent representation for serialization. We'll always pay some performance penalty when converting between the Rust and ROS2 representations. rclrs ends up generating two different versions of each struct, one that is "Rust native" and one that is "byte for byte" but less ergonomic. We could do the same some day if we cared.
  c. Justification: This is how we keep roslibrust homogeneous across backends and prevent ROS2 differences from creeping into roslibrust.
- We'll either rely directly on cdr-rs or on a fork of it that we end up rolling ourselves to do Rust -> CDR serialization and de-serialization.
  a. Risk: This matches what the rest of roslibrust has done with relying on Serde's serialization framework, which makes things super flexible, but does take a performance hit at runtime. roslibrust is likely to always be a little bit slower than r2r or rclrs at serialization/deserialization. I'm willing to accept this, in exchange for freedom from ROS2's build system and dependencies.
  b. Risk: We'll need good unit tests, and are still likely to experience serialization bugs as we're not using the rmw implementation everyone else is using.
  c. Risk: Reading around the ROS2 ecosystem, it sounds like they are considering other serialization formats, so we may end up having to expand the list of serialization formats we support if they add anything else. However, there is likely to be a Rust implementation of whatever they pick anyway.
  d. Justification: This is the only way to separate ourselves from having a Python + CMake + C-Binding dependency that would fundamentally destroy our support story. roslibrust should always be as easy to add as a dependency as any other Rust crate.
- We'll roll our own Zenoh interactions directly in a roslibrust_ros2 backend implementation that match what rmw_zenoh_cpp is doing in terms of topic names and translation between Zenoh idioms and ROS2 idioms.
  a. Risk: If/when rmw_zenoh_cpp changes its topic naming, data format, or whatever, our shit will break and need to be updated. We'll need integration tests against rmw_zenoh_cpp in CI. Luckily, we should be able to detect problems in Rolling and have time to fix them.
  b. Justification: This is the same decision we made with everything else in roslibrust, which is to duplicate ROS's existing behavior. It was safer to do this when ROS1 was "frozen", but we're still willing to eat some pain of maintaining a copy of the behavior in Rust in exchange for toolchain freedom.
- Made to record my reasonings when some ROS dev curses my name in a couple years...
Heck yeah!
Super similar to what zenoh_bridge_ros1 was doing: https://github.com/RosLibRust/roslibrust/blob/95d6458fdc524d2463941b13faee3417fcec716f/roslibrust_zenoh/src/lib.rs#L125
Makes sense to me to just rely on cdr-rs. Given that the message format is at least explicitly declared now and not just the implicit thing it was in ROS1, we at least have some sensible migration path should it change.
I probably won't be able to get to this for a month but having this would be a major boon to a project I'm working on so I'll keep an eye out here if anything moves before I can jump in!
Yeah you've got me excited about it now... I'm likely to take a swing at implementing at least parts of this in the next couple weeks.
Yeah this nerd sniped me... But I got a successful subscribe and deserialize of std_msgs::String working:
Started a ros2-experiments branch where we can play with this.
https://github.com/RosLibRust/roslibrust/pull/242
Little update on progress in #242 and noting my plans for next steps.
- We have a successful end-to-end CI test in which we subscribe to a ROS2 Zenoh topic and actually get and decode the data.
- A huge amount of the work so far has gone into implementing the calculation of the ROS2 message type hash described here. I think the implementation we have now is mostly correct, but the actual code to do it turned into a lot of bad spaghetti as it broke a lot of the previous assumptions of our message generation code. I think we can run with it for now, but it needs a re-write eventually.
I've discovered that a large amount of how ROS functionalities were achieved with Zenoh is by creating various "liveliness tokens" that declare "ROS facts" to the zenoh system. You create a specific token to declare that you are a node that exists, you create a specific token to declare a subscription, etc. Without creating these tokens you can still participate in ROS communication, but you do so "invisibly", in a way that ROS tools won't indicate there is a Node there.
At this point I think we're fairly close to being able to merge #242 onto master. Due to some annoyances with calculating the hash, I had to break backwards compatibility between the ROS2 Time and Duration types and the ROS1 types. I do want to try to clean this up before landing that branch on master.
After that I think we can make a large series of individual issues and PRs to start pulling the ROS2 system into "full compliance". Here are my thoughts for the next issues to go after in rough order:
- Have ROS2 nodes correctly declare their existence on creation.
  - Integration test via ros2 CLI to show node is reported
  - Integration test via ros2 CLI to show node is no longer reported after being dropped
- Have ROS2 nodes correctly declare their subscriptions
  - Integration test
  - Create a ROS2 subscriber example
- Have ROS2 nodes correctly declare their publishing/advertisement
  - Integration test
  - Publisher example
- Build some End-to-End tests and Examples showing our ROS2 code can communicate with itself in a loop
- Same for Service Clients
- Same for Service Servers
- Build some End-to-End tests and Examples showing we can call our own services
- Finalize work:
  - impl the Ros trait for the ROS2 handle
  - Add the ros2 crate as a dependency of central crate behind a ros2 feature flag
  - Documentation
Hello, ROS-Z developer here! I'm glad to see more and more people seeking the Zenoh native ROS solution. We also went through a similar study in the past months, and we began an experimental project, ROS-Z based on the study and our experience with rmw_zenoh_cpp.
We've spent lots of time ensuring the performance is better than rmw_zenoh and comes with less overhead. And we don't need the extra cost of Waitset required in rclrs and r2r.
The last puzzle in ROS-Z is the ROS message generation from IDL. I believe this issue has been well summarized by @Carter12s. IMO, I will lean towards a pure Rust solution. Currently, we are trying to port the basic messages https://github.com/ros2/common_interfaces to see if it's gonna work.
@YuanYuYuan this is fantastic! I absolutely think we should collaborate here to arrive at a single unified solution. roslibrust is looking for the exact same advantages that ROS-Z is looking for.
If we're able to fit ROS-Z behind roslibrust's "standardized trait API" as one of the backends roslibrust supports, we'll be able to provide some real benefit to the whole Rust/ROS eco-system.
I see that ROS-Z is in need of Rust type generation from ROS's format which is work we largely have complete in roslibrust as fully stand-alone pure Rust. I highly suggest that ROS-Z should try to leverage the work we've already completed in roslibrust_codegen, and would happily offer support to expand features / fix bugs in that crate to support ROS-Z adopting it.
ROS-Z code should be able to depend on types implementing roslibrust's RosMessageType trait which will provide both the serde operations needed to work with CDR and access to the ROS_2 hash and the dds name so that the "with_type_info" call is no longer needed.
If you checkout this link: https://github.com/RosLibRust/roslibrust/blob/3837e7e659e71ad257baef4efbcf5b43cd202926/roslibrust_test/src/ros2.rs
You'll see example outputs from roslibrust_codegen that include the needed ROS2_HASH and ROS2_TYPE_NAME fields. These types are compile-time generatable in roslibrust via either a proc_macro or build.rs file. The underlying code dynamically finds ".msg/.srv" files, parses them, and generates the appropriate Rust definitions. I think this is exactly what ROS-Z needs!
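To illustrate why having those fields on the type helps, here is a stripped-down sketch. `Ros2TypeInfo` is a hypothetical stand-in for the relevant part of RosMessageType (the real trait has more to it and the field types may differ), `StringMsg` is a hand-written example rather than actual codegen output, and the constant values are placeholders.

```rust
/// Hypothetical, trimmed-down stand-in for the ROS2-related part of
/// roslibrust's RosMessageType trait.
trait Ros2TypeInfo {
    const ROS2_TYPE_NAME: &'static str;
    const ROS2_HASH: &'static str;
}

/// Hand-written example type; real types would come from roslibrust_codegen.
#[allow(dead_code)]
struct StringMsg {
    data: String,
}

impl Ros2TypeInfo for StringMsg {
    // Placeholder values, not the real name or hash for any message.
    const ROS2_TYPE_NAME: &'static str = "example_type_name";
    const ROS2_HASH: &'static str = "example_hash";
}

/// A backend can read the type information from the generic parameter,
/// so callers never have to pass it in by hand.
fn type_info<T: Ros2TypeInfo>() -> (&'static str, &'static str) {
    (T::ROS2_TYPE_NAME, T::ROS2_HASH)
}
```

A call like `type_info::<StringMsg>()` resolves entirely at compile time, which is the property that would let a publisher or subscriber constructor take just the message type as a generic parameter.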
I do think after reviewing what ROS-Z has started, we're definitely going to abandon attempting to roll this completely from scratch ourselves, and I'll start experimenting with adding ROS-Z directly as the ROS2 backend implementation for roslibrust.
@YuanYuYuan I did some experiments this morning and was able to demonstrate successful interop between roslibrust_codegen's types and ros-z.
Checkout: https://github.com/RosLibRust/roslibrust/blob/6fff596240f89b55a3000c45df037d516eaa28bd/roslibrust_ros2/src/lib.rs
I ran into some challenges that I'll open some issues for on ROS-Z, but overall I think roslibrust leveraging ROS-Z as its backend is extremely promising.
Hi everyone, had a brief text with Carter and wanted to post here to say that I’d love to contribute!
- Experience with ROS: ROS2 with C++ and Python; worked on manipulation (Robot Arms), state estimation (Mobile Robots), and Deep learning based Computer Vision.
- Availability: Around 5–8 hours per week for the next 2 months.
- Rust: I'm very new to Rust, hoping to learn more while working on this project.
Here is my LinkedIn – linkedin.com/in/dheerajbhurewar
Looking forward to helping with the migration and learning more in the process!
I've merged a branch which adds very basic initial support leveraging ROS-Z.
Lots of next steps required, which I'll break into sub-issues from this one.
Going to use this as general tracking issue for ROS2 support and close this issue when we fully release support.