laser_filters icon indicating copy to clipboard operation
laser_filters copied to clipboard

Feature: ros2 port for all filters

Open berend-kupers opened this issue 1 year ago • 2 comments

Hello,

I have added ROS2 support for the following filters, which did have support in noetic-devel:

  • polygon filter
  • scan blob filter
  • sector filter

Key changes

  • Port polygon filter to ROS2
  • Port scan blob filter to ROS2
  • Port sector filter to ROS2
  • Port launch test for polygon filter and speckle filter to ROS2
  • Port unit tests for speckle filter and scan shadows filter to ROS2
  • Added "remove_shadow_start_point_" parameter to scan shadows filter, which wasn´t included in the port to ROS2

Benefits:

  • More filters available in ROS2

Functionality Status:

  • No changes have been made to the (default) functionality of other filters

Note: I want to mention two issues that I've noticed in the already ported laser filters, which would be best to fix in the filters package.

  1. In box_filter and speckle_filter a (lifecycle-) node is created to create a tf listener (box_filter) and to create a on_set_parameters_callback (speckle_filter). When you start a filter chain and rename the node via launch files, it overrides the name for all node instances. So you get the error:

Publisher already registered for provided node name.

  1. Dynamically setting a parameter doesn´t seem to work, even though the speckles_filter does have an on_set_parameters_callback implemented. If I use ros2 param set ..., I get the error:

Setting parameter failed: parameter 'filter2.params.filter_type' cannot be set because it is read-only

I think these parameters are set to read-only in the filters package.

Please let me know if you have any feedback. I would be happy to make any necessary changes.

Kind regards,

Berend Kupers Lowpad

berend-kupers avatar Jan 22 '24 12:01 berend-kupers

What's the status of this PR? @jonbinney Could you please merge this? I'm interested in using the polygon filter but I'd rather avoid copying the whole implementation to my project.

bjsowa avatar Jul 10 '24 16:07 bjsowa

@bjsowa sorry i haven't had time to review and merge this! Are able to try out this branch and give some feedback?

jonbinney avatar Jul 14 '24 22:07 jonbinney

I tested the polygon_filter and it seems to work after implementing the fixes from my review.

However, I found a general flaw in how some of the filters are implemented. Only the logging and parameter interfaces are passed from the main node to the filters. Thus, any filter that needs to add some Subscriptions or Services, is implemented as LifecycleNode. The problem is, the filter nodes are not spun by any executor, thus the subscription or services callbacks are not called.

The result of it is that, for example, the base_footprint_exclude subscription, as well as, parameter and lifecycle related services in laser_scan_polygon_filter do not work.

IMO, all filters should be implemented as a Node (perhaps rclcpp::Node instead of LifecycleNode), either spawning their own spinning thread like the tf2 transform listener or created as the main node's sub-node

bjsowa avatar Jul 15 '24 16:07 bjsowa

@bjsowa I implemented your comments.

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

In our code, I solved it by forking the filters package and passing a pointer to the node instead of just the param/logging interfaces. That works for us, but would introduce a breaking change. I can try to get one of your suggestions working, but as I said, I think that should be a separate PR

berend-kupers avatar Jul 16 '24 08:07 berend-kupers

As for your comments about the lifecycle node. I tried to stay close to the implementation of the other filters, e.g. box filter and indeed noticed that for example the parameter reconfigure doesn´t work. I thought it would be out of scope for this PR, but agree it needs to be fixed.

Agree, it can be addressed in future PRs. @jonbinney I have no more feedback.

bjsowa avatar Jul 16 '24 10:07 bjsowa

Thanks for taking a look @bjsowa !

I've got some time to review this PR this week. It looks like tests are failing on some EOL ROS2 distros, so i've updated the CI here: https://github.com/ros-perception/laser_filters/pull/196

jonbinney avatar Jul 16 '24 21:07 jonbinney

@berend-kupers could you rebase onto current head? Then the tests will run on the updated list of distros. Thanks!

jonbinney avatar Jul 16 '24 22:07 jonbinney

(or merge ros2 branch into this one, whichever you prefer)

jonbinney avatar Jul 16 '24 22:07 jonbinney

@jonbinney Done

berend-kupers avatar Jul 18 '24 07:07 berend-kupers

Looks like the speckle filter test is failing on CI, and I have the same issue when I run locally. It looks like an error during rclcpp shutdown - do you know why this is? Maybe because of the multiple nodes with the same name you mentioned in your Note 1 in the PR description?

jonbinney avatar Jul 19 '24 21:07 jonbinney

I've been looking into what changes would be needed to the filters library to avoid needing to create extra nodes. I think a few things:

Add protected member function to FilterBase to get the NodeLoggingInterface. This would enable logging in the filter. (this should be an easy fix)

Either of the following changes for parameters: 1. Add protected member function in FilterBase for filters get the NodeParametersInterface 2. Add ability in FilterBase for filters to mark parameters as writeable and add protected member function in FilterBase to add parameter setting callbacks. The first option is easier to implement, but is a bit weird because I don't think there is a way to create a NodeParametersInterface that_always_ applies a given prefix, and so the filters would each see the parameters for all the other filters. Their callback functions would also be called anytime the parameters are changed for any filter.

Add a transform listener somewhere... and give filters access to it.

OR we could say forget all of this careful passing of only the interfaces needed by the filters, and provide them a getter() for the entire Node object. This seems quite ugly, although i guess it isn't as ugly as using a singleton node in ROS1.

OR we could go crazy and make each filter its own Component which can all then be loaded dynamically or something.....

In any case, I'd like to get most of this PR merged before any of that stuff happens. These filters are quite useful for people, even without some of their features. @berend-kupers what do you think about removing the dynamic reconfiguring of these filters for now, as well as removing the ability to have two different frames in BoxFilter? Then we can review and merge this PR and keep on with the longer discussion of how to add the extra functionality to the filters package?

jonbinney avatar Jul 30 '24 01:07 jonbinney

Still some tests failing - i should have time to debug them later this week to see what is going on.

jonbinney avatar Aug 06 '24 21:08 jonbinney

ok i think i know why the tests are failing intermittently. In test_speckle_filter.test.py the subscribers are created, then the publishers, then the raw laser scan is published, then we wait for the filtered scan. But in the background, asynchronous network stuff is happening to actually hook up the subscribers. So even though the subscribers are created "first" sometimes they don't actually connect until after the filtered laser scan has already come back. In that case we wait forever.

I added

        rate = node.create_rate(1)
        rate.sleep()

after node.start_subscribers() and before node.publish_laser_scan(), and after that i was able to run the test a dozen times with no failures. This is too hacky though..... ideas on a better fix?

jonbinney avatar Aug 10 '24 00:08 jonbinney

Shouldn't making the publisher transient local do the trick?

Otherwise, I could either

  • publish the scan message at a fixed rate, or
  • create another wait_for_() function and use the node.count_subscribers()

berend-kupers avatar Aug 12 '24 08:08 berend-kupers

By "transient local" do you mean make it a local variable? I'm not sure how that would fix this. I like the idea of publishing the scan at a fixed rate though, that sounds simple and robust to failures.

jonbinney avatar Aug 12 '24 23:08 jonbinney

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

But I'm fine with publishing the scan at a fixed rate as well

berend-kupers avatar Aug 13 '24 06:08 berend-kupers

I meant setting the quality of service setting durability to "transient local" for the publisher. Then the publisher should persist samples for "late-joining" subscribers.

Ah, perfect! I'm not familiar with the ROS2 quality of service settings yet, so I just learned something :-)

jonbinney avatar Aug 13 '24 14:08 jonbinney

Interesting - a couple tests still failing but I'm having a harder time reproducing it locally. Debugging now.....

jonbinney avatar Aug 20 '24 22:08 jonbinney

Looks like you changed the QoS on the publisher created in the python test node - that's not the one that's causing the problem. We're missing the return message published by the filter chain and received by the python test node. I tried changing the QoS in the subscriber in the python node, but it failed (something about the underlying middleware or the options used in the filter chain).

The other fix you suggested (publishing at a fixed rate until we get a reply) does work. I've implemented that and sent a PR to your PR branch - could you review and merge that? :-)

If you're curious, here is the command I used to run the tests repeatedly to reproduce the intermittent failure:

colcon test --retest-until-fail=100 --packages-select laser_filters --event-handlers=console_cohesion+ --ctest-args --output-on-failure; colcon test-result --all

jonbinney avatar Aug 20 '24 23:08 jonbinney

Awesome, tests pass! I'll do a code review tomorrow to make sure nothing major is wrong. Since these filters weren't in ros2 at all before, I don't plan on nit-picking about small changes - those can be fixed in smaller PRs later.

jonbinney avatar Aug 22 '24 23:08 jonbinney

Thanks @berend-kupers, and also @bjsowa for reviewing. Sorry it took me so long to get to!

jonbinney avatar Aug 23 '24 19:08 jonbinney