replace weakref dict with normal dict
Just noticed that with the cell space testing, agent refs became the main bottleneck. This PR just runs the benchmark code to see how much difference it makes. Maybe we can discuss the advantages and disadvantages of weakrefs some more. (That is to say, I want to start a discussion, not necessarily get this merged.)
Performance benchmarks:
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🟢 -22.3% [-22.6%, -22.0%] | 🟢 -25.9% [-26.1%, -25.8%] |
| Schelling | large | 🟢 -23.7% [-24.3%, -23.1%] | 🟢 -26.4% [-27.2%, -25.6%] |
| WolfSheep | small | 🟢 -25.9% [-26.4%, -25.5%] | 🟢 -22.3% [-22.4%, -22.2%] |
| WolfSheep | large | 🟢 -48.0% [-57.1%, -37.6%] | 🟢 -24.7% [-25.5%, -24.0%] |
| BoidFlockers | small | 🟢 -6.7% [-7.2%, -6.1%] | 🔵 -1.9% [-2.5%, -1.3%] |
| BoidFlockers | large | 🟢 -7.0% [-7.6%, -6.3%] | 🔵 +0.1% [-0.5%, +0.7%] |
Those are some serious numbers: a quite consistent ~25% reduction in runtime.
I wonder how we can keep the functionality of weakrefs without (the majority of) the performance impact.
At least it's good to know that this is the effect.
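The gap is easy to reproduce outside Mesa. The sketch below (the `Agent` class and the sizes are made up for illustration) times a bare iteration over a plain dict versus a `WeakKeyDictionary` holding the same keys; on CPython the weak variant is slower because each step dereferences a `weakref.ref` in pure Python:

```python
import time
import weakref

class Agent:
    """Hypothetical stand-in for a Mesa agent; only used for timing."""

agents = [Agent() for _ in range(10_000)]

plain = {a: None for a in agents}
weak = weakref.WeakKeyDictionary({a: None for a in agents})

def time_iteration(mapping, rounds=100):
    """Time how long it takes to walk all keys of `mapping`."""
    start = time.perf_counter()
    for _ in range(rounds):
        for _agent in mapping:
            pass
    return time.perf_counter() - start

t_plain = time_iteration(plain)
t_weak = time_iteration(weak)
print(f"dict: {t_plain:.4f}s, WeakKeyDictionary: {t_weak:.4f}s")
```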
Cython has `__weakref__` functionality that could be tested for performance. I will allocate time this week to migrate the build system from Hatch to Meson, so that experimenting with Cython code becomes much easier.
Didn't we just migrate to Hatch? Anyway, that would be great!
Thanks for checking this; it was also on my list of things to check. I am surprised by the overhead of weakrefs. I knew the implementation is pure Python and would thus carry some overhead, but I never imagined it was this big.
The reason for using weakrefs in AgentSet was to avoid memory leaks. It is also convenient for the user: you don't have to remove agents from the space, the scheduler, and whatever else there might be. However, if the performance overhead is this large, I am wondering whether the cure is worse than the disease.
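The convenience is that entries vanish on their own when the last hard reference to an agent goes away. A minimal sketch of that difference (the `Agent` class here is a made-up stand-in):

```python
import gc
import weakref

class Agent:
    """Hypothetical agent; any weakref-able object works as a key."""

a, b = Agent(), Agent()

registry = weakref.WeakKeyDictionary()
registry[a] = "alive"
registry[b] = "alive"

# Dropping the last hard reference removes the entry automatically, so a
# space or scheduler holding only weak references cannot leak dead agents.
del b
gc.collect()  # make the cleanup deterministic across interpreters
print(len(registry))  # 1 — with a plain dict this would still be 2
```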
So, I ran into the slowness again. While playing with #1994, I moved Cell.agents to Cell._agents and turned Cell.agents into a property that returned an AgentSet. On WolfSheep, this doubles the runtime, which is just plain silly.
So the problem seems to be `WeakKeyDictionary`, which is a pure-Python implementation. It might be possible to substantially improve our speed without losing the use of weakrefs.
A quick update: I looked into the current implementation of `WeakKeyDictionary`. It is basically a clever wrapper around a normal dict with `weakref.ref()` keys. Some minor speedups may be possible because the code uses slightly dated constructions and includes protection for multithreading (which we don't need). I am going to test this over the weekend. Since it is plain Python code, I also see no reason why Cython could not be used; again, I'll try to test that over the weekend.
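For reference, the core idea can be sketched in a few lines. This is a simplified toy, not the stdlib code; it deliberately skips the iteration guards and thread-safety mentioned above:

```python
import gc
import weakref

class MiniWeakKeyDict:
    """Toy version of WeakKeyDictionary: a plain dict keyed by weakref.ref
    objects, plus a callback that evicts entries once a key is collected."""

    def __init__(self):
        self.data = {}

        def remove(wr, selfref=weakref.ref(self)):
            # Called by the interpreter when a key object dies.
            obj = selfref()
            if obj is not None:
                del obj.data[wr]

        self._remove = remove

    def __setitem__(self, key, value):
        self.data[weakref.ref(key, self._remove)] = value

    def __getitem__(self, key):
        # A fresh ref to a live object hashes/compares equal to the stored one.
        return self.data[weakref.ref(key)]

    def __len__(self):
        return len(self.data)

class Agent:
    """Hypothetical key type; any weakref-able object works."""

a, b = Agent(), Agent()
d = MiniWeakKeyDict()
d[a], d[b] = 1, 2
del b
gc.collect()  # make collection deterministic
print(len(d))  # 1 — the entry for b evicted itself
```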
Another experiment: I changed the implementation of AgentSet.shuffle in two ways. I shuffle the weakrefs themselves, and it defaults to in-place updating (a quick and dirty check; this can also be changed within the schedulers). The baseline below is the current code on main. I'll do a comparison against @Corvince's version with a dict later this morning.
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🔵 -0.1% [-0.5%, +0.3%] | 🟢 -8.0% [-8.1%, -7.9%] |
| Schelling | large | 🔵 -1.2% [-2.6%, +0.3%] | 🟢 -8.2% [-9.2%, -7.2%] |
| WolfSheep | small | 🔵 -3.6% [-4.4%, -2.8%] | 🟢 -19.9% [-20.2%, -19.5%] |
| WolfSheep | large | 🟢 -56.3% [-63.0%, -43.8%] | 🟢 -19.0% [-21.8%, -15.5%] |
| BoidFlockers | small | 🔵 +0.9% [-0.6%, +2.6%] | 🔵 -3.4% [-4.4%, -2.3%] |
| BoidFlockers | large | 🔵 +0.1% [-1.3%, +1.5%] | 🔵 -1.9% [-3.1%, -1.0%] |
The new code is below. The key slowdown compared to a normal dict is that when iterating over a `WeakKeyDictionary`, you turn every weakref into a hard ref, and (in AgentSet) back again into a weakref. This is rather inefficient, as the numbers above show.
```python
def shuffle(self, inplace: bool = True) -> AgentSet:
    """Randomly shuffle the order of agents in the AgentSet.

    Args:
        inplace (bool, optional): If True, shuffles the agents in the
            current AgentSet; otherwise, returns a new shuffled AgentSet.
            Defaults to True.

    Returns:
        AgentSet: A shuffled AgentSet. Returns the current AgentSet if
            inplace is True.
    """
    weakrefs = list(self._agents.keyrefs())
    self.random.shuffle(weakrefs)
    if inplace:
        self._agents.data = {entry: None for entry in weakrefs}
        return self
    else:
        return AgentSet(
            (agent for ref in weakrefs if (agent := ref()) is not None),
            self.model,
        )
```
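To make the round-trip cost concrete, here is a small standalone demo. The `Agent` class is a placeholder, and assigning `wkd.data` relies on the CPython implementation detail that `WeakKeyDictionary` stores its backing dict in `.data` (just as the shuffle above does):

```python
import random
import weakref

class Agent:
    """Hypothetical placeholder for a weakref-able agent."""

agents = [Agent() for _ in range(5)]
wkd = weakref.WeakKeyDictionary({a: None for a in agents})

# Old approach: iterating the WeakKeyDictionary dereferences every key
# into a hard ref, and rebuilding wraps each one in a fresh weakref.
hard_refs = list(wkd)
random.shuffle(hard_refs)
rebuilt = weakref.WeakKeyDictionary({a: None for a in hard_refs})

# New approach: shuffle the weakrefs themselves via keyrefs() and swap the
# backing dict in place, skipping the deref/re-wrap round trip entirely.
refs = list(wkd.keyrefs())
random.shuffle(refs)
wkd.data = {ref: None for ref in refs}

print(len(wkd), len(rebuilt))  # both still hold all 5 agents
```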
Great work! Curious if we can get towards that 25%, or even more with Cython.
For completeness: dict (@Corvince's version) against the updated shuffle. So there is still a performance loss for using `WeakKeyDictionary`, but updating the shuffle helps a lot. I am not sure why large Schelling still lags. I am also curious why the init times are so much better for a dict.
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🟢 -17.3% [-17.7%, -16.7%] | 🟢 -16.7% [-17.0%, -16.4%] |
| Schelling | large | 🟢 -19.8% [-20.4%, -19.1%] | 🟢 -24.1% [-24.9%, -23.3%] |
| WolfSheep | small | 🟢 -18.5% [-19.3%, -17.8%] | 🟢 -5.2% [-5.7%, -4.7%] |
| WolfSheep | large | 🟢 -17.6% [-18.8%, -16.1%] | 🟢 -13.2% [-15.4%, -11.3%] |
| BoidFlockers | small | 🟢 -7.9% [-9.2%, -6.8%] | 🔵 -0.9% [-1.5%, -0.4%] |
| BoidFlockers | large | 🟢 -7.1% [-8.5%, -5.6%] | 🔵 -0.6% [-1.3%, +0.2%] |
With #2007 merged and my suggested change to shuffle, we are getting close. Here I compare the dict version against current main with the updated shuffle implementation. As can be seen, we now lose about 15% with weakrefs; we have effectively halved the performance difference on the large models.
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🟢 -19.5% [-20.2%, -18.7%] | 🔵 +1.1% [+0.5%, +1.8%] |
| Schelling | large | 🟢 -21.5% [-22.6%, -20.6%] | 🟢 -12.0% [-16.3%, -8.0%] |
| WolfSheep | small | 🟢 -18.4% [-19.0%, -17.7%] | 🟢 -5.4% [-5.9%, -4.8%] |
| WolfSheep | large | 🟢 -17.0% [-18.5%, -15.3%] | 🟢 -12.5% [-14.2%, -10.9%] |
| BoidFlockers | small | 🟢 -8.6% [-10.2%, -7.0%] | 🔵 -1.0% [-1.8%, -0.3%] |
| BoidFlockers | large | 🟢 -7.0% [-8.5%, -5.5%] | 🔵 -1.8% [-3.0%, -0.4%] |
Performance benchmarks:
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🟢 -22.7% [-23.0%, -22.2%] | 🔵 -2.6% [-2.9%, -2.4%] |
| Schelling | large | 🟢 -29.8% [-39.6%, -24.6%] | 🔵 -1.6% [-2.7%, -0.4%] |
| WolfSheep | small | 🟢 -26.4% [-26.7%, -26.1%] | 🟢 -5.6% [-5.8%, -5.5%] |
| WolfSheep | large | 🟢 -28.8% [-35.9%, -24.9%] | 🟢 -5.8% [-7.9%, -3.5%] |
| BoidFlockers | small | 🟢 -6.6% [-7.1%, -6.1%] | 🔵 -0.8% [-1.6%, +0.0%] |
| BoidFlockers | large | 🟢 -8.4% [-9.0%, -7.8%] | 🔵 -1.1% [-1.7%, -0.4%] |
After being stupid for a bit, I got the benchmarks running again. So the performance difference between dict and `WeakKeyDictionary` is still there, but it is a lot smaller than it was before.
Yes, I think the difference is negligible at this point. I think we can close this PR then, do you agree?