rclcpp icon indicating copy to clipboard operation
rclcpp copied to clipboard

Subscriber is being held when another subscriber callback is being called

Open TonyWelte opened this issue 3 months ago • 23 comments

Bug report

Required Info:

  • Operating System:
    • Ubuntu 24.04 (docker and host)
  • Installation type:
    • binaries
  • Version or commit hash:
    • jazzy
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

#include <rclcpp/rclcpp.hpp>

#include <std_msgs/msg/empty.hpp>

class MyClass {
  public:
    MyClass(rclcpp::Node& node) {
        std::cout << "Constructor: " << this << std::endl;

        _sub = node.create_subscription<std_msgs::msg::Empty>("test", 10, std::bind(&MyClass::callback, this, std::placeholders::_1));
    }

    ~MyClass() {
        std::cout << "Destructor:  " << this << ", Count: " << _sub.use_count() << std::endl;
    }

    void callback(const std_msgs::msg::Empty&) {
        std::cout << "Callback:    " << this << std::endl;
        my_data += 1.0;
    } 

  private:
  rclcpp::Subscription<std_msgs::msg::Empty>::SharedPtr _sub;

  double my_data;
};

int main(int argc, char* argv[]) {
    rclcpp::init(argc, argv);
    
    auto node = std::make_shared<rclcpp::Node>("test_node");

    std::unique_ptr<MyClass> object;
    
    auto sub = node->create_subscription<std_msgs::msg::Empty>("sub", 10, [&object, &node](const std_msgs::msg::Empty&){
        if(object){
            object.reset();
        }
        else {
            object = std::make_unique<MyClass>(*node);
        }
    });

    rclcpp::spin(node);

    rclcpp::shutdown();
    
    return 0;
}

You should be able to build the executable with just rclcpp and std_msgs linked.

I recommend compiling with the address sanitizer to make sure it crashes once the issue occurs. Without it, it might not crash but it's definitely UB.

colcon build --cmake-args -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS=-fsanitize=address

In three terminals

# Term1
ros2 topic pub --rate 10.0 /sub std_msgs/msg/Empty '{}'
# Term2
ros2 topic pub --rate 100.1 /test std_msgs/msg/Empty '{}'
# Term3
ASAN_OPTIONS=new_delete_type_mismatch=0 ros2 run YOUR_TEST_PACKAGE YOU_TEST_EXECUTABLE

Expected behavior

My understanding was that the executor had weak ownership of the subscriber and would only get a shared_ptr when executing the callback. Hence, in a single threaded context when a callback from one subscriber is called it should be possible to destroy another subscriber and clean related objects.

Even in a multi threaded context I would expect the same if the two subscribers interacting with each other are part a mutually exclusive callback group.

Actual behavior

Here when the callback for /sub is called MyClass is destroyed but the subscriber to /test it contains is held by someone else (the count of the share_ptr is 2).

This worked fine with ROS2 humble, the subscriber count was 1 when it's callback wasn't being called.

Assuming this is the intended behavior it becomes difficult cleanup objects used by subscribers (or containing subscribers as in the example). I'm sure it still feasible by keeping a weak_ptr before resting our shared_ptr, then the executor will hopefully release its shared_ptr and only after that destroy the object. But what was as simple as destroying the object becomes more convoluted.

Additional information

I found these issues that are somewhat related but I think it's a different issue since they had callbacks called in parallel which is not the case here:

  • https://github.com/ros-navigation/navigation2/issues/4166#issuecomment-1988937428
  • https://github.com/ros2/rclcpp/issues/2447

TonyWelte avatar Nov 20 '24 09:11 TonyWelte