feat: Cythonize get_cell_list_contents
Requirements so far for migrating to a Cythonized space.py:
- No `from __future__ import annotations`
- No walrus operators
- No custom decorators overriding method definitions

Maybe we should just skip Cythonizing the existing space.py and go straight to Cythonizing #1994.
This is the error message due to the custom decorator:
```
error: subprocess-exited-with-error

× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [50 lines of output]
    [cython]
    pre-build artifacts
    Building c/c++ extensions...
    ['./mesa/space.pyx']
    cythonize exited non null status 1

    Error compiling Cython file:
    ------------------------------------------------------------
    ...
        for x, y in cell_list:
            cell = self._grid[x][y]
            if cell != default_val:
                yield cell

    @accept_tuple_argument
    ^
    ------------------------------------------------------------

    mesa/space.pyx:393:4: Cdef functions cannot take arbitrary decorators.
    Compiling ./mesa/space.pyx because it changed.
    [1/1] Cythonizing ./mesa/space.pyx
    Traceback (most recent call last):
      File "/tmp/tmpr6ja8kfv/setup.py", line 20, in <module>
        ext_modules = cythonize(
                      ^^^^^^^^^^
```
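The decorator Cython rejects here is `accept_tuple_argument`, which wraps a single `(x, y)` tuple into a one-element list before calling the method. One way around the restriction, shown below purely as a sketch (not necessarily what this PR or #1994 ends up doing), is to drop the decorator and normalize the argument inside the method body:

```cython
cpdef get_cell_list_contents(self, object cell_list):
    # Sketch only: accept either a single (x, y) tuple or an iterable of
    # tuples, mirroring what the accept_tuple_argument wrapper used to do.
    if isinstance(cell_list, tuple) and len(cell_list) == 2:
        cell_list = [cell_list]
    return list(self.iter_cell_list_contents(cell_list))
```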
Thanks for this effort. It might indeed be interesting to see how #1994 develops.
> Requirements so far for migrating to a Cythonized space.py:
> - No `from __future__ import annotations`
> - No walrus operators
> - No custom decorators overriding method definitions
I think that's really not too bad at all, and I think the changes in this file are only minimal. Hope you get this to work!
@quaquel was enthusiastic about this effort and wanted to port some parts to Cython once the whole build pipeline was up and running.
He also suggested that this might be an interesting NumFocus / GSoC idea.
For me, the most important first step is moving to meson and ensuring we can build the code for the various platforms, etc. Once that machinery is in place, we can start experimenting.
In my view, the long-term goal is to move all performance-critical parts to Cython. However, we are also simultaneously rebuilding various parts (schedulers, spaces), and we are still debating other parts, like data collection. So it is a bit of a shame to port stuff that will be replaced anyway. But it is also potentially annoying to have new experimental features only in Cython, because these parts might change, and it is less accessible for anyone running into an issue and willing to investigate.
In my experience, it is not that much work to turn good Python code into Cython code. So writing experimental parts in Cython is fine for me as a developer. I am less sure about the user implications. We might also just test it with, for example, the new grid stuff and see what we learn.
Another option that I have seen is that libraries offer two versions of essentially the same code: a slow Python version and a fast C/Cython/C++ version. The drawback is that it induces code duplication.
So, no clear conclusion for me. Just some choices that have to be made.
> For me, the most important first step is moving to meson and ensuring we can build the code for the various platforms, etc. Once that machinery is in place, we can start experimenting.
I stick with Hatch here because:
- It already works in this PR, so the pyproject.toml diff can be reused in another separate PR that needs Cython experimentation (a sketch of the underlying build wiring follows this list).
- Migrating to Meson is a bit more involved (e.g. it needs a meson.build in the repo root and in the mesa/ folder), and I would need help from GPT-4.
- I don't want to block the Cython experimentation, so that it can happen sooner.
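For what it's worth, the traceback above shows that the Cython build hook boils down to a temporary setup.py that calls `cythonize` on `mesa/space.pyx`. A minimal hand-written equivalent would look roughly like the sketch below (an illustration, not the exact file Hatch generates):

```python
# Sketch of a setuptools/Cython build shim roughly equivalent to what the
# build hook generates; the path mirrors the traceback above.
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize(
        ["./mesa/space.pyx"],
        compiler_directives={"language_level": 3},  # assumption: Py3 semantics
    ),
)
```

During development, `cythonize -i mesa/space.pyx` builds the extension in place; a plain `pip install .` achieves the same through the packaging step.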
> So it is a bit of a shame to port stuff that will be replaced anyway. But it is also potentially annoying to have new experimental features only in Cython, because these parts might change, and it is less accessible for anyone running into an issue and willing to investigate.
I suppose it is safer, and a useful exercise, to Cythonize the stable space.py code, given that it is unlikely to change except for bugfixes. Even with the old space.py, we can still experiment with using the C++ STL map to avoid the GIL, and hence be faster than a Cython dict.
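To make the STL-map idea concrete, here is a rough sketch (with made-up names, not code from this PR) of how Cython can use `std::map` via `libcpp.map` on typed memoryviews with the GIL released, which is what would let it beat a Python/Cython dict in hot loops:

```cython
# distutils: language = c++
# cython: boundscheck=False, wraparound=False
# Sketch only: `ids_grid` and `default_val` are illustrative stand-ins.
from libcpp.map cimport map as cpp_map

cpdef long count_occupied(long[:, :] ids_grid, long default_val):
    """Tally occupied cells into a C++ std::map without holding the GIL."""
    cdef cpp_map[long, long] tally
    cdef Py_ssize_t i, j
    cdef long key
    cdef long total = 0
    with nogil:
        for i in range(ids_grid.shape[0]):
            for j in range(ids_grid.shape[1]):
                key = ids_grid[i, j]
                if key != default_val:
                    tally[key] = tally[key] + 1  # operator[] default-inserts 0
                    total += 1
    return total
```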
> Another option that I have seen is that libraries offer two versions of essentially the same code: a slow Python version and a fast C/Cython/C++ version. The drawback is that it induces code duplication.
I wouldn't want code duplication unless necessary. It's going to be hard to maintain.
Just a quick clarifying question: do you plan to move to Meson in the longer run?
> Just a quick clarifying question: do you plan to move to Meson in the longer run?
Yes, personally, when I have the time to properly port the packaging system.
Though we probably need comprehensive reasoning for why it has to be Meson, to reach a consensus among the @projectmesa/maintainers. My reason for picking Hatch at the time: https://github.com/projectmesa/mesa/pull/1882#issue-1999889086.
I don't know enough about build systems to make an informed decision about this.
The story behind SciPy's move to Meson: https://labs.quansight.org/blog/2021/07/moving-scipy-to-meson. The RFC itself.
From the previous link:
> they can look forward to much faster builds and a more pleasant development experience. So how fast is it? Currently the build takes about 1min 50s (a ~4x improvement) on my 3 year old 12-core Intel CPU (i9-7920X @ 2.90GHz): ... When I started this endeavour, I wasn't yet completely sure using Meson was going to work out. However I did know that if building SciPy was successful, Meson will work for pretty much any other scientific Python project. ... There are a lot of technical reasons to argue for or against using any build system. However, why I'm convinced Meson is a great choice for both SciPy and other projects is: Meson is a pleasure to use.
That was a nice and helpful read. Like @EwoutH, I don't have much expertise in build systems. But the fact that some of the big libraries in the scientific computing Python ecosystem use Meson (e.g., SciPy, NumPy, pandas) speaks in favor of using it, even though our needs are nowhere near as complicated as theirs.
> the fact that some of the big libraries in the scientific computing Python ecosystem use Meson (e.g., SciPy, NumPy, pandas) speaks in favor of using it
Will read up later, but agreed! Best practices are best practices for a reason.
New commits are about improving the typing annotations, since Cython is stricter, and removing the @overload decorator due to this error message (a sketch of the replacement signature follows the output below):
```
args = (<mesa.space.SingleGrid object at 0x7f39716c7f90>, (0, 1)), kwds = {}

    def _overload_dummy(*args, **kwds):
        """Helper for @overload to raise when called."""
>       raise NotImplementedError(
            "You should not call an overloaded function. "
            "A series of @overload-decorated functions "
            "outside a stub module should always be followed "
            "by an implementation that is not @overload-ed.")
E       NotImplementedError: You should not call an overloaded function. A series of @overload-decorated functions outside a stub module should always be followed by an implementation that is not @overload-ed.
```
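A sketch of the kind of replacement this implies (not the exact diff in this PR): collapse the @overload pair into a single implementation with a union annotation. Since `from __future__ import annotations` is off the table for the Cython port, the `X | Y` syntax is evaluated at definition time and therefore needs Python 3.10+, which lines up with the 3.9 requirement mentioned next.

```python
# Sketch only; the names follow mesa.space, but the exact signature is illustrative.
from collections.abc import Iterable

Coordinate = tuple[int, int]

class _Grid:
    def get_cell_list_contents(
        self, cell_list: Coordinate | Iterable[Coordinate]
    ) -> list:
        """Single implementation instead of a pair of @overload stubs.

        Without `from __future__ import annotations`, the union in the
        signature is evaluated at runtime and requires Python >= 3.10.
        """
        ...
```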
This PR can only be merged if support for 3.9 is dropped, and so it depends on #2003.
Forgot to say that tests have passed for Python >= 3.10.
I want to dig into this in the near future. In the meantime, do you have any idea about the performance improvements?
There shouldn't be much of a performance improvement in get_cell_list_contents yet (it may be only marginally faster). The actual fast version of it is:
```cython
# (Assumes `import numpy as np` at module level.)
cpdef long[:, :] convert_tuples_to_mview(self, object cell_list):
    # Pack a Python list of (x, y) tuples into a typed memoryview.
    cdef long x, y
    cdef long[:, :] tuples_mview
    length = len(cell_list)
    tuples_mview = np.ndarray((length, 2), long)
    for i in range(length):
        pos = cell_list[i]
        x, y = pos[0], pos[1]
        tuples_mview[i, 0], tuples_mview[i, 1] = x, y
    return tuples_mview

cpdef object[:] get_cell_mview_contents(self, long[:, :] tuples_mview):
    # Collect the agents at each coordinate, skipping empty cells.
    cdef long default_val
    cdef int count
    cdef object[:] agent_mview
    cdef long x, y
    length = len(tuples_mview)
    agent_mview = np.ndarray(length, object)
    count = 0
    default_val = self._default_val_ids()
    for i in range(length):
        x, y = tuples_mview[i, 0], tuples_mview[i, 1]
        id_agent = self._ids_grid[x, y]
        if id_agent == default_val:
            continue
        agent_mview[count] = self._agents_grid[x, y]
        count += 1
    return agent_mview[:count]

cpdef get_cell_list_contents(self, object cell_list):
    # Public entry point: tuples -> coordinate memoryview -> agent memoryview -> list.
    tuples_mview = self.convert_tuples_to_mview(cell_list)
    agent_mview = self.get_cell_mview_contents(tuples_mview)
    return self.convert_agent_mview_to_list(agent_mview)
```
which is not in this PR. Here I limit the scope to preparing a Cython setup.
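For completeness: `convert_agent_mview_to_list` is referenced above but not shown. A hypothetical version, just to illustrate the last step of the pipeline, might look like:

```cython
cpdef list convert_agent_mview_to_list(self, object[:] agent_mview):
    # Hypothetical sketch: materialize the trimmed memoryview of agents
    # back into a plain Python list for the public API.
    cdef Py_ssize_t i
    return [agent_mview[i] for i in range(agent_mview.shape[0])]
```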
The benchmark failed because it didn't run `pip install .` / `cythonize -i mesa/space.pyx`, and so `mesa.space` is missing.
Local benchmark result
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🔴 +9.5% [+9.1%, +10.0%] | 🟢 -11.0% [-11.4%, -10.6%] |
| Schelling | large | 🔴 +8.2% [+6.4%, +10.1%] | 🟢 -6.2% [-6.8%, -5.7%] |
| WolfSheep | small | 🔴 +3.4% [+3.1%, +3.9%] | 🔵 -0.3% [-0.5%, -0.1%] |
| WolfSheep | large | 🔵 +25.9% [+2.2%, +61.8%] | 🔵 -1.1% [-3.0%, +0.5%] |
| BoidFlockers | small | 🔵 +2.1% [+1.6%, +2.5%] | 🟢 -11.4% [-12.0%, -10.8%] |
| BoidFlockers | large | 🔵 +2.3% [+1.7%, +2.8%] | 🟢 -16.1% [-16.7%, -15.5%] |
> The benchmark failed because it didn't run `pip install .` / `cythonize -i mesa/space.pyx`, and so `mesa.space` is missing.
@EwoutH you could probably help debug this faster. I tried at the commit "benchmark: Install main & PR branch of Mesa explicitly", but I'm not sure if it is the right approach.
The commit probably works, but it only takes effect after it is merged, due to the way pull_request_target works. So could you do this in a separate PR? Once the other PR is merged, you can rerun the benchmark on this one.
Performance benchmarks:
| Model | Size | Init time [95% CI] | Run time [95% CI] |
|---|---|---|---|
| Schelling | small | 🔴 +12.4% [+12.0%, +12.8%] | 🔴 +38.6% [+38.4%, +38.7%] |
| Schelling | large | 🟢 -32.4% [-48.7%, -13.5%] | 🔴 +33.6% [+33.2%, +34.1%] |
| WolfSheep | small | 🔴 +7.3% [+7.0%, +7.6%] | 🔴 +30.2% [+30.0%, +30.4%] |
| WolfSheep | large | 🔵 +9.7% [-17.8%, +40.0%] | 🔴 +30.9% [+28.7%, +33.1%] |
| BoidFlockers | small | 🔵 +1.8% [+1.4%, +2.2%] | 🟢 -4.7% [-5.1%, -4.3%] |
| BoidFlockers | large | 🔵 +0.8% [+0.5%, +1.0%] | 🟢 -10.4% [-10.9%, -10.0%] |
Closing in favor of using Rust instead.