rclpy icon indicating copy to clipboard operation
rclpy copied to clipboard

(NumberOfEntities) improve performance

Open MatthijsBurgh opened this issue 1 year ago • 7 comments
trafficstars

The MultiThreadedExecutor has very bad performance, see https://github.com/ros2/rclpy/issues/1223. Refactoring it, not easy task. So first fixing some low hanging fruit. By improving performance.

These changes are not compatible for classes inheriting from this class. But the repr wasn't compatible either. I don't see any class inheriting from this class any time.

MatthijsBurgh avatar May 17 '24 08:05 MatthijsBurgh

@MatthijsBurgh thanks for creating PR.

Could you add description here? like which issue are you trying to address and background information?

fujitatomoya avatar May 17 '24 22:05 fujitatomoya

@fujitatomoya done

MatthijsBurgh avatar May 18 '24 19:05 MatthijsBurgh

@fujitatomoya done

Can you please explain why this change helps the situation?

clalancette avatar May 19 '24 13:05 clalancette

@clalancette please check the following performance example.

class NumberOfEntities:
    __slots__ = [
        "num_subscriptions",
        "num_guard_conditions",
        "num_timers",
        "num_clients",
        "num_services",
        "num_events",
    ]

    def __init__(
        self,
        num_subs=0,
        num_gcs=0,
        num_timers=0,
        num_clients=0,
        num_services=0,
        num_events=0,
    ):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers
        self.num_clients = num_clients
        self.num_services = num_services
        self.num_events = num_events

    def __add__(self, other):
        result = self.__class__()
        for attr in result.__slots__:
            left = getattr(self, attr)
            right = getattr(other, attr)
            setattr(result, attr, left + right)
        return result

    def __iadd__(self, other):
        for attr in self.__slots__:
            left = getattr(self, attr)
            right = getattr(other, attr)
            setattr(self, attr, left + right)
        return self

    def __repr__(self):
        return "<{0}({1}, {2}, {3}, {4}, {5}, {6})>".format(
            self.__class__.__name__,
            self.num_subscriptions,
            self.num_guard_conditions,
            self.num_timers,
            self.num_clients,
            self.num_services,
            self.num_events,
        )


class NumberOfEntities2:
    __slots__ = [
        "num_subscriptions",
        "num_guard_conditions",
        "num_timers",
        "num_clients",
        "num_services",
        "num_events",
    ]

    def __init__(
        self,
        num_subs=0,
        num_gcs=0,
        num_timers=0,
        num_clients=0,
        num_services=0,
        num_events=0,
    ):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers
        self.num_clients = num_clients
        self.num_services = num_services
        self.num_events = num_events

    def __add__(self, other):
        result = self.__class__()
        result.num_subscriptions = self.num_subscriptions + other.num_subscriptions
        result.num_guard_conditions = self.num_guard_conditions + other.num_guard_conditions
        result.num_timers = self.num_timers + other.num_timers
        result.num_clients = self.num_clients + other.num_clients
        result.num_services = self.num_services + other.num_services
        result.num_events = self.num_events + other.num_events
        return result

    def __iadd__(self, other):
        self.num_subscriptions += other.num_subscriptions
        self.num_guard_conditions += other.num_guard_conditions
        self.num_timers += other.num_timers
        self.num_clients += other.num_clients
        self.num_services += other.num_services
        self.num_events += other.num_events
        return self

    def __repr__(self):
        return "<{0}({1}, {2}, {3}, {4}, {5}, {6})>".format(
            self.__class__.__name__,
            self.num_subscriptions,
            self.num_guard_conditions,
            self.num_timers,
            self.num_clients,
            self.num_services,
            self.num_events,
        )


bla1 = NumberOfEntities(1, 2, 3, 4, 5, 6)
bla2 = NumberOfEntities(1, 2, 3, 4, 5, 6)
bla3 = NumberOfEntities2(1, 2, 3, 4, 5, 6)
bla4 = NumberOfEntities2(1, 2, 3, 4, 5, 6)
for _ in range(1000000):
    bla1 + bla2
real	0m0.753s
user	0m0.748s
sys	0m0.005s
for _ in range(1000000):
    bla1 += bla2
real	0m0.602s
user	0m0.601s
sys	0m0.000s
for _ in range(1000000):
    bla3 + bla4
real	0m0.398s
user	0m0.393s
sys	0m0.004s
for _ in range(1000000):
    bla3 += bla4
real	0m0.337s
user	0m0.333s
sys	0m0.005s

MatthijsBurgh avatar May 20 '24 12:05 MatthijsBurgh

@fujitatomoya your conclusion is correct

MatthijsBurgh avatar May 20 '24 20:05 MatthijsBurgh

@clalancette @sloretz what do you think?

Yep, I think this is entirely reasonable to get a bit more performance out of this.

clalancette avatar May 20 '24 20:05 clalancette

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

fujitatomoya avatar May 21 '24 05:05 fujitatomoya