mesa icon indicating copy to clipboard operation
mesa copied to clipboard

Improve AgentSet method chaining performance

Open quaquel opened this issue 1 year ago • 4 comments

While adding groupby and updating select, there was some discussion on the performance of method chaining. The current code loops over all agents in each method. However, if you are chaining methods, this can become quite inefficient. Because each loop involves turning weakrefs back into hard refs, applying the operation, and creating a new weakref. Moreover, you are looping over all agents multiple times.

@EwoutH mentioned exploring deferred execution, while @Corvince suggested using generators. The main advance of chaining generators is that it is possible to let each operation in the chain operate on a single agent. So, you are effectively looping over all agents only once. Moreover, it might be possible to pass the actual agent from operation to operation so that only at the beginning the weakref is resolved and only at the end of all operations a new weakref is created.

Exactly how chaining generators would work, what the API implications would be, and what the performance gains are remains to be explored. As a first step, it is probably a good idea to develop a proof of principle. Next, we can discuss how this proof of principle can be used with the existing AgentSet, assuming real performance benefits exist.

quaquel avatar Sep 01 '24 07:09 quaquel

Before jumping to a POC, maybe let's think about this a bit more.

What should this do

agentset.do("foo").do("bar")

It should do foo and then bar. But what if bar depends on foo? So should it be

# A)
for agent in agentset:
    agent.foo()
    agent.bar()

# B)
for agent in agentset:
    agent.foo()

for agent in agentset:
    agent.bar()

I think B) is more common in abms - it's interactions what makes abm interesting.

But for that case we can't use generators - we have to evaluate first.

For agentset.shuffle() it also doesn't make sense to use generators (reasoning left as an exercise for the reader ;) )

That leaves agentset.select. Conceptually it makes sense to return an iterator here. But I don't have an idea on how this may look like. Maybe an LazyAgentset

Or we scrap this all together and users just need to use an good old for loop for best performance. For which the semantics are already good

Corvince avatar Sep 02 '24 05:09 Corvince

While thinking about this over the weekend, I also realized it makes no sense for shuffle. Your point about do is also very clear. It might make sense for set and get in addition to select. I, however, doubt the hassle, code and API wise, is worth the performance gains.

quaquel avatar Sep 02 '24 06:09 quaquel

Some experiments, somewhat related:

  • https://github.com/projectmesa/mesa/pull/2283
  • https://github.com/projectmesa/mesa/pull/2284

EwoutH avatar Sep 06 '24 11:09 EwoutH

I was thinking, I preferably don't want to change the user API too much with specialty functions. So after brainstorming a bit (conversation), adding a _pending_shuffle flag could help:

class AgentSet:
    def __init__(self, agents, random: Random = None):
        self._agents = agents  # Assuming _agents is a dict or similar structure
        self.random = random or Random()
        self._pending_shuffle = False  # Internal flag to track shuffle

    def shuffle(self):
        self._pending_shuffle = True  # Set the flag
        return self  # Allow method chaining

    def do(self, method: str | Callable, *args, **kwargs) -> 'AgentSet':
        if self._pending_shuffle:
            # If shuffle was called before do, use shuffle_do
            self._pending_shuffle = False  # Reset the flag
            return self.shuffle_do(method, *args, **kwargs)
        
        # Original do implementation
        if isinstance(method, str):
            for agent in self._agents:
                getattr(agent, method)(*args, **kwargs)
        else:
            for agent in self._agents:
                method(agent, *args, **kwargs)
        return self

    def shuffle_do(self, method: str | Callable, *args, **kwargs) -> 'AgentSet':
        # Optimized shuffle_do implementation
        agents = list(self._agents.keys())  # Assuming _agents is a dict
        self.random.shuffle(agents)

        if isinstance(method, str):
            for agent in agents:
                getattr(agent, method)(*args, **kwargs)
        else:
            for agent in agents:
                method(agent, *args, **kwargs)

        return self

However, this will break calling shuffle() on itself. So, maybe we add a argument to shuffle(), like delay=True, which will delay the shuffle to the next function. Then functions like do, map and select could accept this delayed shuffle.

EwoutH avatar Sep 19 '24 15:09 EwoutH