
Deprecate seed in favor of rng

Open quaquel opened this issue 1 month ago • 16 comments

This PR addresses #2884.

It does several things:

  1. Adds a decorator for annotating methods that currently take seed or random as a kwarg. The decorator raises a FutureWarning when seed or random is used instead of the new preferred rng kwarg. I use FutureWarning because this warning is intended for end users (see https://docs.python.org/3/library/warnings.html).
  2. Adds this decorator to all methods that have seed or random as a kwarg.
  3. Starts adding rng to all decorated classes and updates their inner workings to use numpy generators instead of stdlib random generators.
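As a rough sketch of the decorator mechanism described in point 1 (the names `deprecate_seed_kwarg` and `make_model` are illustrative, not the PR's actual implementation):

```python
import functools
import warnings


def deprecate_seed_kwarg(func):
    """Warn when the deprecated 'seed' or 'random' kwargs are used.

    Illustrative sketch of the pattern, not the PR's actual code.
    """

    @functools.wraps(func)
    def wrapper(*args, seed=None, random=None, rng=None, **kwargs):
        if seed is not None or random is not None:
            # FutureWarning is shown by default, so end users see it.
            warnings.warn(
                "'seed' and 'random' are deprecated; use 'rng' instead",
                FutureWarning,
                stacklevel=2,
            )
        return func(*args, seed=seed, random=random, rng=rng, **kwargs)

    return wrapper


@deprecate_seed_kwarg
def make_model(*, seed=None, random=None, rng=None):
    # Hypothetical decorated callable; prefers rng, falls back to seed.
    return rng if rng is not None else seed
```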

reminder we need to fix the set_seed widget on the solara side as well!

quaquel avatar Nov 18 '25 09:11 quaquel

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔵 +1.8% [+0.6%, +3.0%] | 🔵 +0.2% [+0.0%, +0.4%] |
| BoltzmannWealth | large | 🔵 -0.4% [-1.4%, +0.5%] | 🔵 -1.2% [-2.8%, +0.4%] |
| Schelling | small | 🔵 -0.6% [-0.8%, -0.3%] | 🔵 +0.3% [-0.0%, +0.6%] |
| Schelling | large | 🔵 -1.0% [-1.5%, -0.6%] | 🔵 -2.4% [-3.7%, -1.0%] |
| WolfSheep | small | 🔵 +0.7% [+0.3%, +1.0%] | 🔵 +1.0% [+0.8%, +1.1%] |
| WolfSheep | large | 🔵 +0.6% [-0.8%, +2.1%] | 🔵 +2.1% [+0.5%, +3.5%] |
| BoidFlockers | small | 🔵 +1.5% [+0.5%, +2.4%] | 🔵 +0.9% [+0.7%, +1.0%] |
| BoidFlockers | large | 🔵 +0.5% [-0.2%, +1.0%] | 🔵 +1.0% [+0.7%, +1.3%] |

github-actions[bot] avatar Nov 18 '25 10:11 github-actions[bot]

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +20.0% [+18.9%, +21.1%] | 🟢 -3.8% [-4.0%, -3.6%] |
| BoltzmannWealth | large | 🔴 +13.5% [+10.8%, +15.2%] | 🔴 +10.9% [+8.8%, +13.2%] |
| Schelling | small | 🔴 +21.2% [+20.7%, +21.8%] | 🔵 -1.9% [-2.1%, -1.6%] |
| Schelling | large | 🔴 +13.8% [+13.0%, +14.7%] | 🔴 +6.4% [+4.5%, +7.9%] |
| WolfSheep | small | 🔴 +16.5% [+15.9%, +17.1%] | 🔴 +20.5% [+13.5%, +27.8%] |
| WolfSheep | large | 🔴 +19.6% [+18.7%, +20.4%] | 🔴 +33.0% [+30.8%, +35.0%] |
| BoidFlockers | small | 🔵 +0.8% [+0.4%, +1.3%] | 🔵 -0.4% [-0.6%, -0.2%] |
| BoidFlockers | large | 🔵 +1.0% [+0.6%, +1.4%] | 🔵 +0.0% [-0.3%, +0.4%] |

github-actions[bot] avatar Nov 18 '25 12:11 github-actions[bot]

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +16.1% [+15.3%, +17.0%] | 🟢 -4.4% [-4.6%, -4.3%] |
| BoltzmannWealth | large | 🔴 +12.8% [+11.0%, +13.8%] | 🔵 -0.5% [-0.7%, -0.3%] |
| Schelling | small | 🔴 +20.8% [+20.6%, +20.9%] | 🟢 -5.3% [-5.4%, -5.1%] |
| Schelling | large | 🔴 +15.5% [+15.2%, +15.8%] | 🔵 -1.6% [-2.1%, -1.2%] |
| WolfSheep | small | 🔴 +15.0% [+14.6%, +15.3%] | 🔴 +20.3% [+13.6%, +27.8%] |
| WolfSheep | large | 🔴 +15.4% [+15.0%, +15.8%] | 🔴 +21.2% [+20.5%, +22.1%] |
| BoidFlockers | small | 🔵 -1.0% [-1.4%, -0.6%] | 🔵 -0.7% [-0.9%, -0.5%] |
| BoidFlockers | large | 🔵 -1.5% [-1.8%, -1.1%] | 🔵 -0.9% [-1.2%, -0.7%] |

github-actions[bot] avatar Nov 18 '25 15:11 github-actions[bot]

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +15.2% [+13.8%, +16.6%] | 🔴 +65.9% [+65.5%, +66.2%] |
| BoltzmannWealth | large | 🔴 +5.5% [+4.9%, +6.1%] | 🔴 +56.0% [+54.0%, +58.1%] |
| Schelling | small | 🔴 +7.3% [+7.1%, +7.5%] | 🔵 -1.7% [-1.8%, -1.5%] |
| Schelling | large | 🔵 +2.7% [+2.2%, +3.2%] | 🔴 +78.0% [+76.6%, +79.2%] |
| WolfSheep | small | 🔴 +24.2% [+23.6%, +24.7%] | 🔴 +35.6% [+28.2%, +43.3%] |
| WolfSheep | large | 🔴 +22.6% [+21.5%, +23.5%] | 🔴 +31.1% [+29.4%, +32.5%] |
| BoidFlockers | small | 🔵 +0.4% [-0.4%, +1.2%] | 🔵 +0.2% [-0.1%, +0.5%] |
| BoidFlockers | large | 🔵 +1.0% [+0.4%, +1.7%] | 🔵 +0.8% [+0.6%, +1.0%] |

github-actions[bot] avatar Nov 18 '25 19:11 github-actions[bot]

As can be seen in the performance benchmarks, shifting from stdlib random to numpy.random.default_rng comes with a performance hit. I dug into this a bit more. The two main random calls made in these benchmark models are shuffle and choice. `np.random.default_rng().shuffle()` is about 20% faster than `random.shuffle()`. However, `np.random.default_rng().choice` is several orders of magnitude slower than `random.choice()`. In fact, for the specific case of choosing a single item randomly from a collection, the common advice is to use `collection[rng.integers(0, len(collection))]` instead, but even this is still twice as expensive as random.choice.

More broadly, numpy.random.default_rng() is designed for generating and operating on numpy arrays. So calls that create a single random number (e.g., rng.random, rng.integers) are all much slower than their stdlib random alternatives. However, as the requested size grows, they become increasingly faster than a list comprehension with stdlib random. So shifting to rng comes with additional implementation considerations to work around this performance hit. See some of the updated examples for what that can look like.
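A minimal illustration of this tradeoff (the sizes are arbitrary; the actual crossover point depends on hardware and NumPy version):

```python
import random

import numpy as np

rng = np.random.default_rng(42)

# Single draws: stdlib random has far less per-call overhead.
x = random.random()   # one float in [0, 1)
y = rng.random()      # same kind of result, but slower per call

# Batched draws: numpy pulls ahead as size grows, since one
# vectorized call replaces a Python-level loop.
n = 10_000
stdlib_batch = [random.random() for _ in range(n)]
numpy_batch = rng.random(size=n)
```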

Thoughts are welcome. In your opinion, is the improved quality of the random number generators and the fact that you have local state with numpy worth the performance hit?

quaquel avatar Nov 19 '25 10:11 quaquel

Given how common it is to draw a single random number (i.e., random.random()), I was wondering about adding a shorthand for this and seeing how fast it is.

```python
import numpy as np

rng = np.random.default_rng()

def test_rand(rng, initial_size=100):
    # Serve single floats from a pre-generated batch.
    initial_values = rng.random(size=initial_size)

    while True:
        for entry in initial_values:
            yield entry
        # Batch exhausted: draw a fresh one in a single vectorized call.
        initial_values = rng.random(size=initial_size)

def draw():
    return next(a)

a = test_rand(rng, 200)
```

And then I timed these as shown below

```python
%%timeit
[random.random() for _ in range(250)]
# 8.88 μs ± 72.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%%timeit
[draw() for _ in range(250)]
# 20.7 μs ± 130 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%%timeit
[next(a) for _ in range(250)]
# 16.4 μs ± 121 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%%timeit
[rng.random() for _ in range(250)]
# 67.9 μs ± 273 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```

So, a naive shift from random.random to rng.random is almost an order of magnitude slower. We can effectively halve this by using the generator and advancing it directly via next; wrapping this next call in another function adds some overhead, but it is still a lot faster than a naive replacement. So, I am considering adding a method to the model, model.rand, that performs this task. The other option would be to figure out how to subclass numpy.random.Generator and add this rand method to it, because then you could just do self.rng.rand as a much faster alternative to rng.random().
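For reference, a self-contained sketch of this buffered-draw idea (BufferedRand and its buffer size are illustrative; the eventual model.rand may look different):

```python
import numpy as np


class BufferedRand:
    """Serve single floats in [0, 1) from a pre-generated numpy array,
    refilling the array when it is exhausted.

    Sketch of the proposed wrapper, not the PR's actual implementation.
    """

    def __init__(self, rng, buffer_size=256):
        self._rng = rng
        self._size = buffer_size
        self._buffer = rng.random(size=buffer_size)
        self._index = 0

    def __call__(self):
        if self._index >= self._size:
            # Buffer exhausted: draw a fresh batch in one vectorized call.
            self._buffer = self._rng.random(size=self._size)
            self._index = 0
        value = self._buffer[self._index]
        self._index += 1
        return value


rng = np.random.default_rng(42)
rand = BufferedRand(rng)
draws = [rand() for _ in range(1000)]
```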

quaquel avatar Nov 20 '25 10:11 quaquel

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +17.0% [+15.3%, +18.7%] | 🔴 +50.4% [+50.0%, +50.8%] |
| BoltzmannWealth | large | 🔴 +9.1% [+7.3%, +10.5%] | 🔴 +60.8% [+55.1%, +65.5%] |
| Schelling | small | 🔴 +16.5% [+16.1%, +16.9%] | 🔵 +3.6% [+2.9%, +4.3%] |
| Schelling | large | 🔴 +10.6% [+9.0%, +12.3%] | 🔴 +80.4% [+70.8%, +91.3%] |
| WolfSheep | small | 🔴 +6.0% [+5.4%, +6.5%] | 🔴 +39.3% [+32.2%, +47.1%] |
| WolfSheep | large | 🔴 +6.6% [+5.5%, +7.9%] | 🔴 +42.3% [+38.9%, +45.7%] |
| BoidFlockers | small | 🔵 +3.2% [+2.7%, +3.7%] | 🔵 +1.9% [+1.6%, +2.3%] |
| BoidFlockers | large | 🔴 +3.6% [+3.2%, +3.9%] | 🔵 +1.9% [+1.4%, +2.4%] |

github-actions[bot] avatar Nov 20 '25 10:11 github-actions[bot]

More broadly, numpy.random.default_rng() is designed for generating and operating on numpy arrays.

I don't know if it's a good idea, but: can you pre-generate an array (of, say, 100 or 1,000 numbers), use those numbers, and every time you have used up the array generate a new one? Basically, generate random numbers in batches?

EwoutH avatar Nov 21 '25 09:11 EwoutH

I don't know if it's a good idea, but: can you pre-generate an array (of, say, 100 or 1,000 numbers), use those numbers, and every time you have used up the array generate a new one? Basically, generate random numbers in batches?

yes, that is exactly what I show here: https://github.com/projectmesa/mesa/pull/2888#issuecomment-3557219776

quaquel avatar Nov 21 '25 17:11 quaquel

@quaquel Your stuff is always great and this is no exception. I don't think I have anything substantive to add. From my perspective it is better to improve the rng even if it has a performance hit.

Then the next question becomes, are there ways to mitigate the performance hit while ensuring optimal randomness. Two options on the table:

  1. wrapper
  2. subclass

I think 1 is the best bet. Subclassing a generator implemented via C extensions from Python may create unexpected challenges, with unintended impacts on user simulations.

At this point I would recommend 1, but always open to discussion!

tpike3 avatar Dec 09 '25 02:12 tpike3

Can we add a Model-level keyword argument that defaults to using the NumPy RNG but allows you to use random if you need the performance?

EwoutH avatar Dec 09 '25 04:12 EwoutH

I'll try to take another look at this sometime this week. As a first step, I'll add the wrapper idea to the model class so you can just do model.rand if you need a single number. Next, I'll rerun the benchmarks to determine the performance difference.

@EwoutH, nothing prevents you from using stdlib random in your own models in the future. If we decide to make model.rng the only option in Mesa 4, you can still have MyCustomModel.random as well, but you would have to add this to the model yourself.

quaquel avatar Dec 09 '25 06:12 quaquel

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +12.8% [+11.6%, +13.9%] | 🔴 +59.8% [+59.1%, +60.4%] |
| BoltzmannWealth | large | 🔵 +12.7% [-4.9%, +34.8%] | 🔴 +29.8% [+27.0%, +32.9%] |
| Schelling | small | 🔴 +16.4% [+15.9%, +16.9%] | 🔵 +2.8% [+2.0%, +3.5%] |
| Schelling | large | 🔵 -2.7% [-16.1%, +9.2%] | 🔴 +102.2% [+91.0%, +118.3%] |
| WolfSheep | small | 🔴 +21.9% [+20.9%, +22.9%] | 🔴 +47.3% [+36.3%, +58.0%] |
| WolfSheep | large | 🔵 +34.3% [-0.6%, +107.5%] | 🔵 +13.6% [-8.5%, +41.7%] |
| BoidFlockers | small | 🔵 +0.2% [-0.4%, +0.8%] | 🔵 -0.9% [-1.2%, -0.5%] |
| BoidFlockers | large | 🔵 +0.5% [+0.1%, +1.0%] | 🔵 +0.2% [-0.0%, +0.4%] |

github-actions[bot] avatar Dec 09 '25 11:12 github-actions[bot]

The benchmarks are a bit funky via GitHub Actions, so I ran them locally. The results are shown below. The TLDR from my perspective is that the performance loss is now acceptable and the benefits of using numpy's random number generation outweigh the loss of performance.

This is the benchmarking with a new model.rand method for generating individual random numbers on the unit interval. Internally, this wraps a numpy array of a prespecified length that is replaced once exhausted. For further details, see my comment with the code above. The model.rand method is only used by the WolfSheep benchmark, but it significantly reduces the performance loss compared to a naive use of rng.random().

There is still some performance loss as can be seen, but this is more manageable. Moreover, some of this is likely due to the overhead of the deprecation decorator I added as part of this PR. I'll run a local benchmark later, removing this decorator, to get a clearer sense of the performance loss.

/opt/anaconda3/bin/python /Users/jhkwakkel/Documents/GitHub/mesa/benchmarks/compare_timings.py

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +10.1% [+9.0%, +11.0%] | 🔴 +33.9% [+33.4%, +34.2%] |
| BoltzmannWealth | large | 🔴 +9.6% [+9.2%, +10.0%] | 🔴 +27.2% [+24.8%, +29.0%] |
| Schelling | small | 🔴 +15.5% [+14.9%, +16.0%] | 🔵 -1.0% [-2.0%, -0.1%] |
| Schelling | large | 🔴 +12.1% [+11.5%, +12.5%] | 🔴 +62.7% [+59.2%, +65.4%] |
| WolfSheep | small | 🔴 +16.5% [+15.8%, +17.3%] | 🔴 +30.4% [+23.3%, +37.7%] |
| WolfSheep | large | 🔴 +15.1% [+14.9%, +15.4%] | 🔴 +25.0% [+22.6%, +26.6%] |
| BoidFlockers | small | 🔵 +0.5% [-0.1%, +1.1%] | 🔵 -0.3% [-0.5%, -0.1%] |
| BoidFlockers | large | 🔵 -2.0% [-4.3%, -0.1%] | 🔵 -1.1% [-2.5%, -0.1%] |

quaquel avatar Dec 10 '25 08:12 quaquel

I'll run a local benchmark later, removing this decorator, to get a clearer sense of the performance loss.

Curious for this. Because 25 to 60% runtime is not trivial.

EwoutH avatar Dec 10 '25 08:12 EwoutH

Curious for this. Because 25 to 60% runtime is not trivial.

It's a bit tricky to directly compare the runtime of the old and new cases for both wolf sheep and Schelling. In the case of Schelling, if everyone is happy, no further moves take place, and thus the model runs much faster. In the case of wolf sheep, if all the sheep die quickly, again the model runs much faster afterwards. Since the random numbers are different, the fraction of models that reach this fast "state" might very well be different.

Based on my testing of individual operations (e.g., random.shuffle vs. rng.shuffle), there is a performance loss of about 30%-50% on these individual operations. How that adds up in a bigger model is more difficult to assess because these individual operations are just a small part of the overall model. In many ways, Boltzmann might be the best one to look at because this model has virtually no overhead from agent logic.

Additionally, I would like to delve a bit deeper by conducting some line profiling to identify where performance is being lost exactly.

Incidentally, I just ran a quick benchmark with all the decorators removed, and the overhead of the deprecation decorator is just a few percent.

quaquel avatar Dec 10 '25 08:12 quaquel

@EwoutH, @tpike3, I had a closer look at what is happening with the runtime. In short, there are two key changes here:

  • random.choice --> some_collection[rng.integers(0, len(some_collection))]
  • random.shuffle --> rng.shuffle

There is also random.random, but that is not part of the core of Mesa and is only used in a single benchmark model, so we can ignore that for now.

To start with the good news, rng.shuffle is an order of magnitude faster than random.shuffle. If I test this change in isolation, the benchmark models run 5%-10% faster.

The problem is with replacing random.choice. A naive replacement with rng.choice is almost two orders of magnitude slower (if I read these units correctly...).

```python
%%timeit
random.choice(a)
# 181 ns ± 9.22 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

%%timeit
rng.choice(a)
# 7.07 μs ± 69.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```

This is, in fact, a known issue. There are various issues related to this, both in the NumPy git repository and on Stack Overflow. One frequently made suggestion is to use rng.integers if you want to choose only a single random item from a collection. This does make a difference:

```python
%%timeit
a[rng.integers(0, len(a))]
# 892 ns ± 4.18 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

So still almost an order of magnitude slower, but at least we are in the same time units again. This is the version I have used in this PR.
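To make the pattern concrete, here is a small side-by-side (the collection contents are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
agents = ["wolf", "sheep", "grass", "fence"]

# Slow for single draws: rng.choice carries high per-call overhead.
slow_pick = rng.choice(agents)

# Faster pattern used in this PR: index the collection with one
# random integer instead of calling rng.choice.
fast_pick = agents[rng.integers(0, len(agents))]
```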

So how does all this relate to the reported benchmarks? Most of our benchmark models rely on random walks, random selections of agents, or random selection of a cell to move to. All of these rely on rng.choice/some_collection[rng.integers(0, len(some_collection))]. Thus, it is not surprising that the benchmarks are what they are. However, I would caution against over-interpreting this. Many real-world models in my experience don't rely as much on random.choice.

Additionally, I believe there may be room for further improvements. In particular, the random choice inside cell collections can probably be tuned a bit more by doing something similar to the earlier example of rng.random by wrapping rng.integers in a generator.
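One possible shape for that generator-based tuning (batched_integers is a hypothetical helper, not part of the PR):

```python
import numpy as np


def batched_integers(rng, high, batch_size=128):
    """Yield single random indices in [0, high), drawing them from
    numpy in batches to amortize the per-call overhead.

    Hypothetical helper sketching the idea, not the PR's actual code.
    """
    while True:
        for idx in rng.integers(0, high, size=batch_size):
            yield int(idx)


rng = np.random.default_rng(3)
cells = list(range(50))
picker = batched_integers(rng, len(cells))
chosen = [cells[next(picker)] for _ in range(10)]
```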

I remain in favor of proceeding with this PR. The proper control of random number generation, avoiding race conditions, and the higher quality of the random numbers are all crucial for the proper use of ABMs.

quaquel avatar Dec 16 '25 08:12 quaquel