rclpy icon indicating copy to clipboard operation
rclpy copied to clipboard

Memory leak in ActionServer. ```taken_data = self._handle.take_goal_request```

Open holesond opened this issue 1 year ago • 4 comments
trafficstars

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • 3.3.13 (Humble)
  • DDS implementation:
    • rmw_cyclonedds_cpp
  • Client library (if applicable):
    • rclpy

Steps to reproduce issue

NOTE: The zipped fibonacci_package.zip ROS2 package contains all the files required to reproduce the issue.

Action definition Fibonacci.action is similar to https://docs.ros.org/en/jazzy/Tutorials/Intermediate/Creating-an-Action.html.

int32 order
int32[] data
---
int32[] sequence
---
int32[] partial_sequence

I adapted the action client and server from https://docs.ros.org/en/jazzy/Tutorials/Intermediate/Writing-an-Action-Server-Client/Py.html.

Action client fibonacci_action_client.py:

#!/usr/bin/env python3

import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node

from fibonacci_package.action import Fibonacci



class FibonacciActionClient(Node):

    def __init__(self):
        super().__init__('fibonacci_action_client')
        self._action_client = ActionClient(self, Fibonacci, 'fibonacci')
        self.data_buffer = [int(i) for i in range(1000000)]

    def send_goal(self, order):
        goal_msg = Fibonacci.Goal()
        goal_msg.order = order
        goal_msg.data = self.data_buffer
        self._action_client.wait_for_server()
        return self._action_client.send_goal_async(goal_msg)


def main(args=None):
    rclpy.init(args=args)
    action_client = FibonacciActionClient()
    for i in range(500):
        future = action_client.send_goal(10)
        rclpy.spin_until_future_complete(action_client, future)


if __name__ == '__main__':
    main()

Action server fibonacci_action_server.py:

#!/usr/bin/env python3

import time
import tracemalloc

import rclpy
from rclpy.action import ActionServer
from rclpy.node import Node

from fibonacci_package.action import Fibonacci



class FibonacciActionServer(Node):

    def __init__(self):
        super().__init__('fibonacci_action_server')
        self._action_server = ActionServer(
            self,
            Fibonacci,
            'fibonacci',
            self.execute_callback)

    def execute_callback(self, goal_handle):
        self.get_logger().info('Executing goal...')
        goal_handle.succeed()
        result = Fibonacci.Result()
        return result


def main(args=None):
    tracemalloc.start()

    rclpy.init(args=args)
    fibonacci_action_server = FibonacciActionServer()

    snapshot1 = tracemalloc.take_snapshot()
    t0 = time.time()
    while time.time()-t0 < 60.0:
        rclpy.spin_once(fibonacci_action_server)
    snapshot2 = tracemalloc.take_snapshot()
    top_stats = snapshot2.compare_to(snapshot1, 'lineno')

    print("[ Top 10 differences ]")
    for stat in top_stats[:10]:
        print(stat)


if __name__ == '__main__':
    main()

Start the client before the server, each in a separate terminal:

ros2 run fibonacci_package fibonacci_action_client.py
ros2 run fibonacci_package fibonacci_action_server.py

Expected behavior

Constant memory footprint of both server and client. (Below 100 MiB per process.)

Actual behavior

The client memory footprint is constant (below 78 MiB). However, the server memory footprint grows over time at the rate of ca. 1.3 GiB per minute.

The server prints:

[INFO] [1718631498.683267906] [fibonacci_action_server]: Executing goal...
[INFO] [1718631498.864721171] [fibonacci_action_server]: Executing goal...
(...)
[INFO] [1718631499.046434560] [fibonacci_action_server]: Executing goal...
[INFO] [1718631499.229864275] [fibonacci_action_server]: Executing goal...
[ Top 10 differences ]
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:453: size=1325 MiB (+1325 MiB), count=1817 (+1817), average=747 KiB
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:68: size=92.7 KiB (+92.7 KiB), count=1479 (+1479), average=64 B
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:300: size=47.7 KiB (+47.7 KiB), count=982 (+982), average=50 B
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/task.py:32: size=46.4 KiB (+46.4 KiB), count=657 (+657), average=72 B
/opt/ros/humble/local/lib/python3.10/dist-packages/unique_identifier_msgs/msg/_uuid.py:77: size=41.2 KiB (+41.2 KiB), count=977 (+977), average=43 B
/usr/lib/python3.10/posixpath.py:373: size=40.3 KiB (+40.3 KiB), count=349 (+349), average=118 B
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:301: size=40.0 KiB (+40.0 KiB), count=655 (+655), average=63 B
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:312: size=39.2 KiB (+39.2 KiB), count=645 (+645), average=62 B
/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py:275: size=33.3 KiB (+33.3 KiB), count=581 (+581), average=59 B
/usr/lib/python3.10/linecache.py:137: size=33.2 KiB (+33.2 KiB), count=326 (+326), average=104 B

Additional information

The memory leak seems to happen in /opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py, note the allocated 1325 MiB reported by tracemalloc.

Line 453 of /opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py is:

taken_data = self._handle.take_goal_request(
    self._action_type.Impl.SendGoalService.Request,
)

which calls https://github.com/ros2/rclpy/blob/humble/rclpy/src/rclpy/action_server.cpp#L127.

holesond avatar Jun 17 '24 14:06 holesond

Please note that the results will be retained on the action server for a while (the default is 900 seconds).

https://github.com/ros2/rclpy/blob/091f7eb64af8a5e52db4c8fbc6818499ec563a0a/rclpy/rclpy/action/server.py#L200

You can modify this parameter to determine if it's causing the issue.

Barry-Xu-2018 avatar Jun 18 '24 02:06 Barry-Xu-2018

i think this can be resolved with https://github.com/ros2/rclpy/pull/1171, default value is way too big...

fujitatomoya avatar Jun 18 '24 22:06 fujitatomoya

Thank you @Barry-Xu-2018 and @fujitatomoya for the suggestions. Setting result_timeout=1 has fixed the memory leak for me. A ten second long result_timeout may still be too long for larger request or response payloads such as images.

holesond avatar Jun 20 '24 14:06 holesond

@holesond you are welcome.

A ten second long result_timeout may still be too long for larger request or response payloads such as images.

https://github.com/ros2/rclpy/pull/1171 changes default timeout, you can configure the timeout as they like.

fujitatomoya avatar Jun 20 '24 15:06 fujitatomoya