rclpy
rclpy copied to clipboard
Bad performance of the python wrapper for customized message
Bug report
Required Info:
- Operating System:
- Ubuntu 18.04
- Installation type:
- Binary
- Version or commit hash:
- ROS2 Foxy [I also tried with source installation of the master branch (commit: f2727c10407b6419425e2c0c3da24da8e75543ca)]
- DDS implementation:
- Default
- Client library (if applicable):
- rclpy
Steps to reproduce issue
My customized message structs are as follow:
- Point3D
float64 x
float64 y
float64 z
- Quadrilateral.msg
Point3D vertex1
Point3D vertex2
Point3D vertex3
Point3D vertex4
- Path.msg
uint64 timestamp
uint32 counter
Quadrilateral[] quads
I am trying to send/receive path which is an array of quadrilaterals with custom vertex and 3d point defined. I wrote a simple python pub/sub using ros2 python API to check the latency for sending/receiving the data.
For subscriber:
import rclpy
from rclpy.node import Node
from custom_msg.msg import Path
class MinimalSubscriber(Node):
def __init__(self):
super().__init__('minimal_subscriber')
self.sub = self.create_subscription(Path, "topic1", self.msg_callback, 10)
def msg_callback(self, msg):
time_now = self.get_clock().now().to_msg().sec*1e6 + self.get_clock().now().to_msg().nanosec*1e-3
time_diff = (time_now - msg.timestamp)
self.get_logger().info(f'ID={msg.counter}, time_diff(us)={time_diff}.')
def main(args=None):
rclpy.init(args=args)
minimal_subscriber = MinimalSubscriber()
rclpy.spin(minimal_subscriber)
minimal_subscriber.destroy_node()
rclpy.shutdown()
if __name__ == '__main__':
main()
For publisher:
import rclpy
from rclpy.node import Node
from custom_msg.msg import Path, Point3D, Quadrilateral
class MinimalPublisher(Node):
def __init__(self):
super().__init__('minimal_publisher')
self.pub = self.create_publisher(Path, "topic1", 10)
self.counter = 0
timer_period = 0.3 # seconds
self.timer = self.create_timer(timer_period, self.timer_callback)
def timer_callback(self):
path_msg = Path()
array_size = 2000
# Path rectangles
for idx in range(array_size):
point_msg = Point3D()
point_msg.x = 1.0
point_msg.y = 2.0
point_msg.z = 3.0
quad_msg = Quadrilateral()
quad_msg.vertex1 = point_msg
quad_msg.vertex2 = point_msg
quad_msg.vertex3 = point_msg
quad_msg.vertex4 = point_msg
path_msg.quads.append(quad_msg)
path_msg.timestamp = int(self.get_clock().now().to_msg().sec*1e6 + self.get_clock().now().to_msg().nanosec*1e-3)
path_msg.counter = self.counter
self.pub.publish(path_msg)
self.get_logger().info(f'Publishing, ID={self.counter}')
self.counter = self.counter + 1
def main(args=None):
rclpy.init(args=args)
minimal_publisher = MinimalPublisher()
rclpy.spin(minimal_publisher)
minimal_publisher.destroy_node()
rclpy.shutdown()
if __name__ == '__main__':
main()
I am running my test on Intel® Core™ i7-10875H CPU @ 2.30GHz × 16.
From the output message you can see the performance is bad for 2000 such points on rospy pub/sub.
The problem worsens by increasing number of points.
Both the pub/sub are ros2 py nodes for the test below.
[rospy_subscriber] [INFO] [1634085151.093488686] [rospy_subscriber_1]: ID=2, time_diff(us)=61142.5.
[rospy_subscriber] [INFO] [1634085151.375783202] [rospy_subscriber_1]: ID=3, time_diff(us)=58105.0.
[rospy_subscriber] [INFO] [1634085151.677111317] [rospy_subscriber_1]: ID=4, time_diff(us)=59528.75.
[rospy_subscriber] [INFO] [1634085151.977215465] [rospy_subscriber_1]: ID=5, time_diff(us)=58506.0.
[rospy_subscriber] [INFO] [1634085152.277354668] [rospy_subscriber_1]: ID=6, time_diff(us)=59265.0.
[rospy_subscriber] [INFO] [1634085152.578599392] [rospy_subscriber_1]: ID=7, time_diff(us)=59052.75.
[rospy_subscriber] [INFO] [1634085152.879509733] [rospy_subscriber_1]: ID=8, time_diff(us)=60466.75.
[rospy_subscriber] [INFO] [1634085153.179206110] [rospy_subscriber_1]: ID=9, time_diff(us)=60270.0.
[rospy_subscriber] [INFO] [1634085153.446394519] [rospy_subscriber_1]: ID=10, time_diff(us)=61281.0.
[rospy_subscriber] [INFO] [1634085153.785892480] [rospy_subscriber_1]: ID=11, time_diff(us)=67035.25.
[rospy_subscriber] [INFO] [1634085154.079719437] [rospy_subscriber_1]: ID=12, time_diff(us)=61042.0.
[rospy_subscriber] [INFO] [1634085154.379861563] [rospy_subscriber_1]: ID=13, time_diff(us)=61086.75.
[rospy_subscriber] [INFO] [1634085154.679701551] [rospy_subscriber_1]: ID=14, time_diff(us)=61027.75.
[rospy_subscriber] [INFO] [1634085154.979384697] [rospy_subscriber_1]: ID=15, time_diff(us)=60663.25.
[rospy_subscriber] [INFO] [1634085155.259688132] [rospy_subscriber_1]: ID=16, time_diff(us)=59996.75.
[rospy_subscriber] [INFO] [1634085155.555330209] [rospy_subscriber_1]: ID=17, time_diff(us)=61148.5.
[rospy_subscriber] [INFO] [1634085155.880077077] [rospy_subscriber_1]: ID=18, time_diff(us)=61085.25.
[rospy_subscriber] [INFO] [1634085156.180117333] [rospy_subscriber_1]: ID=19, time_diff(us)=61256.0.
[rospy_subscriber] [INFO] [1634085156.485948720] [rospy_subscriber_1]: ID=20, time_diff(us)=67039.75.
[rospy_subscriber] [INFO] [1634085156.780058947] [rospy_subscriber_1]: ID=21, time_diff(us)=60994.25.
The performance looks much better if the same pub/sub are cpp nodes. Both the pub/sub are ros2 cpp nodes for the test below.
[cpp_subscriber-1] [INFO] [1634084002.770410162] [cpp_subscriber_0]: ID:'0' TimeDiff(us): '681.311000'
[cpp_subscriber-1] [INFO] [1634084003.070118250] [cpp_subscriber_0]: ID:'1' TimeDiff(us): '471.856000'
[cpp_subscriber-1] [INFO] [1634084003.370076587] [cpp_subscriber_0]: ID:'2' TimeDiff(us): '479.345000'
[cpp_subscriber-1] [INFO] [1634084003.670046005] [cpp_subscriber_0]: ID:'3' TimeDiff(us): '484.958000'
[cpp_subscriber-1] [INFO] [1634084003.970028758] [cpp_subscriber_0]: ID:'4' TimeDiff(us): '487.742000'
[cpp_subscriber-1] [INFO] [1634084004.269984672] [cpp_subscriber_0]: ID:'5' TimeDiff(us): '489.848000'
[cpp_subscriber-1] [INFO] [1634084004.569917083] [cpp_subscriber_0]: ID:'6' TimeDiff(us): '471.193000'
[cpp_subscriber-1] [INFO] [1634084004.869907160] [cpp_subscriber_0]: ID:'7' TimeDiff(us): '482.498000'
[cpp_subscriber-1] [INFO] [1634084005.169864305] [cpp_subscriber_0]: ID:'8' TimeDiff(us): '479.010000'
[cpp_subscriber-1] [INFO] [1634084005.469827717] [cpp_subscriber_0]: ID:'9' TimeDiff(us): '480.361000'
[cpp_subscriber-1] [INFO] [1634084005.769788033] [cpp_subscriber_0]: ID:'10' TimeDiff(us): '482.940000'
Expected behavior
Actual behavior
Additional information
Feature request
Feature description
Implementation considerations
FYI, context about this issue, from https://github.com/ros2/ros2/issues/1200 (quoting @pulkitgoyal1506):
@aprotyas The fix applied to master branch here ros2/rosidl_python#129 does not solve the problem of delays with ros2 python node, I mentioned above.
I compiled ros2 master from source using the below commit. ros2/rosidl_python@f2727c1
The problem still remains the same for me.
Hi, reposting my comment from https://github.com/ros2/rosidl_python/pull/129 :
I have a CPP service that sends data in the form of sensor_msgs/Image. For image of size 3 x 720 x 1280, it takes 350 ms for the image to be received in a Python client. For an image of size 3 x 480 x 640, it takes 100 ms. Since the time as well as size difference is close to 3, it indicates that latency becomes worse as the data size increases similar to the observation by @pulkitgoyal1506. Also looks like its present when publishing from CPP and receiving in Python and not just from Python to Python.