
JupyterViz: Simulation step is very slow for huge grid size (e.g. 80x80)

Open rht opened this issue 2 years ago • 7 comments

This is a continuation of the discussion started in https://github.com/projectmesa/mesa/issues/1772#issuecomment-1713805080. @rlskoeser suggested Altair, which might be faster than Solara's Matplotlib backend.

I ran a manual benchmark. On my laptop (i5-1345U), the portrayal generation and ax.scatter took 80 ms, but the slowest part is actually Solara's savefig (https://github.com/widgetti/solara/blob/a747a680478653ab73c3f9323aeb5fee45147b60/solara/components/matplotlib.py#L54), which took 1.2 s.
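A measurement like this can be approximated outside Solara with a small timing helper. The following is only a sketch: `time_savefig` is a hypothetical name, the 80x80 scatter is placeholder data rather than a real model, and the numbers will differ by machine and Matplotlib version.

```python
import io
import time

import matplotlib

matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np


def time_savefig(fig, fmt):
    """Return (elapsed seconds, bytes written) for a single savefig call."""
    buf = io.BytesIO()
    start = time.perf_counter()
    fig.savefig(buf, format=fmt)
    elapsed = time.perf_counter() - start
    return elapsed, buf.tell()


# Placeholder data: one marker per cell of an 80x80 grid.
xs, ys = np.meshgrid(np.arange(80), np.arange(80))
fig, ax = plt.subplots()
ax.scatter(xs.ravel(), ys.ravel(), s=4)

# The format is a parameter, so different output formats can be compared.
for fmt in ("svg", "png"):
    elapsed, size = time_savefig(fig, fmt)
    print(f"{fmt}: {elapsed * 1000:.0f} ms, {size} bytes")

plt.close(fig)
```

Writing to an in-memory buffer keeps disk I/O out of the measurement, so the timing mostly reflects rendering and encoding.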

I changed the output format from svg to png, and the savefig time went down from 1.2 s to 180 ms. I also experimented with updating the scatter plot in place with set_offsets (for x, y) and set_sizes (for size): the portrayal generation and scatter update went down from 80 ms to 6 ms. The branch I used to experiment can be found at https://github.com/rht/mesa/tree/solara_perf.
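For readers unfamiliar with the in-place update, here is a minimal sketch of the idea. It is not the code from the branch: `update_scatter` is a hypothetical helper and the positions are random placeholder data.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 80 * 80  # one marker per cell of an 80x80 grid

fig, ax = plt.subplots()
positions = rng.integers(0, 80, size=(n, 2))
# Create the scatter artist once and keep a reference to it.
scatter = ax.scatter(positions[:, 0], positions[:, 1], s=np.full(n, 4.0))
# Fix the axis limits up front, since set_offsets does not autoscale the axes.
ax.set_xlim(-1, 80)
ax.set_ylim(-1, 80)


def update_scatter(scatter, positions, sizes):
    """Move and resize the existing markers instead of calling ax.scatter again."""
    scatter.set_offsets(positions)  # (N, 2) array of x, y pairs
    scatter.set_sizes(sizes)  # N marker areas, in points squared


# On each simulation step, only the data of the existing artist changes.
new_positions = rng.integers(0, 80, size=(n, 2))
update_scatter(scatter, new_positions, np.full(n, 9.0))
```

Reusing the artist avoids rebuilding the collection on every step, which is presumably where the saving in the scatter update comes from.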

This is very promising.

rht avatar Sep 16 '23 06:09 rht

According to https://discuss.streamlit.io/t/plot-library-speed-trial/4688, Altair is orders of magnitude faster than Matplotlib.

rht avatar Sep 16 '23 06:09 rht

Altair is also suitable for interactive exploration, whereas Matplotlib only displays a static image in Solara (https://solara.dev/examples/libraries/altair).

Also, we don't have to worry about thread safety, since the actual plotting in Altair happens in JavaScript (Vega).

Corvince avatar Sep 16 '23 07:09 Corvince

Here is some code for a very simple Altair grid chart:

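import altair as alt
import solara


def altair_space(model, test):
    # `test` is not used inside this function.
    def get_data(agent, pos):
        # One record per occupied cell; empty cells implicitly return None.
        if agent:
            return {"x": pos[0], "y": pos[1], "type": agent.type}

    # Drop the None entries produced by empty cells.
    data = list(
        filter(None, (get_data(agent, pos) for agent, pos in model.grid.coord_iter()))
    )
    # Draw each occupied cell as a rectangle, colored by agent type.
    chart = (
        alt.Chart(alt.Data(values=data))
        .mark_rect()
        .encode(x="x:O", y="y:O", color="type:N")
    )
    return solara.FigureAltair(chart)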

But it does appear to be slower than Matplotlib :( Could you test this out, @rht?

Edit: this is meant to be used only with the Schelling example.

Corvince avatar Sep 20 '23 20:09 Corvince

> According to https://discuss.streamlit.io/t/plot-library-speed-trial/4688, Altair is orders of magnitude faster than Matplotlib.

It might have some issues with Jupyter. But yes, the graphics are very rich; in particular, you can display a lot of information via tooltips.

ankitk50 avatar Sep 26 '23 14:09 ankitk50

Just tested. Altair is a no-go for now. At first, I got this error:

File /venv/lib/python3.11/site-packages/altair/utils/data.py:81, in limit_rows.<locals>.raise_max_rows_error()
     80 def raise_max_rows_error():
---> 81     raise MaxRowsError(
     82         "The number of rows in your dataset is greater "
     83         f"than the maximum allowed ({max_rows}).\n\n"
     84         "Try enabling the VegaFusion data transformer which "
     85         "raises this limit by pre-evaluating data\n"
     86         "transformations in Python.\n"
     87         "    >> import altair as alt\n"
     88         '    >> alt.data_transformers.enable("vegafusion")\n\n'
     89         "Or, see https://altair-viz.github.io/user_guide/large_datasets.html "
     90         "for additional information\n"
     91         "on how to plot large datasets."
     92     )

MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000).

Try enabling the VegaFusion data transformer which raises this limit by pre-evaluating data
transformations in Python.
    >> import altair as alt
    >> alt.data_transformers.enable("vegafusion")

Or, see https://altair-viz.github.io/user_guide/large_datasets.html for additional information
on how to plot large datasets.

Then I ran pip install -U "vegafusion[embed]" vl-convert-python and got this Solara error:

Traceback (most recent call last):
  File "/code/venv/lib/python3.11/site-packages/reacton/core.py", line 1661, in _render
    root_element = el.component.f(*el.args, **el.kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/venv/lib/python3.11/site-packages/solara/components/figure_altair.py", line 31, in FigureAltair
    raise KeyError(f"{key4} and {key5} not in mimebundle:\n\n{bundle}")
KeyError: 'application/vnd.vegalite.v4+json and application/vnd.vegalite.v5+json not in mimebundle:\n\n{\'application/vnd.vega.v5+json\': {\'$schema\': \'https://vega.github.io/schema/vega/v5.json\', \'data\': [{\'name\': \'source_0\', \'values\': [{\'x\': 0, \'y\': 0}, {\'x\': 0, \'y\': 3}, {\'x\': 0, \'y\': 4}, {\'x\': 0, \'y\': 5}, ..., 73}, {\'x\': 79, \'y\': 74}, {\'x\': 79, \'y\': 75}, {\'x\': 79, \'y\': 76}, {\'x\': 79, \'y\': 77}, {\'x\':79, \'y\': 78}, {\'x\': 79, \'y\': 79}]}, {\'name\': \'source_0_x_domain_x\', \'values\': [{\'min\': 0, \'max\': 79}]}, {\'name\': \'source_0_y_domain_y\', \'values\': [{\'min\': 0, \'max\': 79}]}], \'marks\': [{\'type\': \'symbol\', \'name\': \'marks\', \'from\': {\'data\': \'source_0\'}, \'encode\': {\'update\': {\'y\': {\'field\': \'y\', \'scale\': \'y\'}, \'ariaRoleDescription\': {\'value\': \'point\'}, \'x\': {\'field\': \'x\', \'scale\': \'x\'}, \'opacity\': {\'value\': 0.7}, \'fill\': {\'value\': \'#4c78a8\'}, \'description\': {\'signal\': \'"x: " + (format(datum["x"], "")) + "; y: " + (format(datum["y"], ""))\'}}}, \'style\': [\'point\']}], \'scales\': [{\'name\': \'x\', \'type\': \'linear\', \'domain\': [{\'signal\': \'(data("source_0_x_domain_x")[0] || {}).min\'}, {\'signal\': \'(data("source_0_x_domain_x")[0] || {}).max\'}], \'range\': [0, {\'signal\': \'width\'}], \'zero\': True, \'nice\': True}, {\'name\': \'y\', \'type\': \'linear\', \'domain\': [{\'signal\': \'(data("source_0_y_domain_y")[0] || {}).min\'}, {\'signal\': \'(data("source_0_y_domain_y")[0] || {}).max\'}], \'range\': [{\'signal\':\'height\'}, 0], \'zero\': True, \'nice\': True}], \'style\': \'cell\', \'padding\': 5, \'width\': 300, \'height\': 300, \'background\': \'white\'}, \'text/plain\': \'<VegaLite 5 object>\\n\\nIf you see this message, it means the renderer has not been properly enabled\\nfor the frontend that you are using. For more information, see\\nhttps://altair-viz.github.io/user_guide/display_frontends.html#troubleshooting\\n\'}'

rht avatar Jan 28 '24 19:01 rht

You got me curious. This page talks about this: https://altair-viz.github.io/user_guide/large_datasets.html

However, I tried it in mesa-interactive and it handled it just fine, so it seems to be solvable. For performance see the screencast

e6497a6d-aa3d-42cf-b626-e824c290ca11.webm

Corvince avatar Jan 28 '24 20:01 Corvince

There is not much difference in the code other than:

  • mesa-interactive uses mark_rect()
  • I haven't implemented on_click
  • I don't specify a scale for x and y
  • I specify the type as type="ordinal" instead of "y:N" for clarity

It could be that my own install is problematic (I am using pip instead of conda). But at least it seems that for a huge grid, the answer is to use Altair.

rht avatar Jan 29 '24 02:01 rht

It seems fixing #1741 made it faster for both Matplotlib and Altair. Currently, the Altair version on main works even without the vegafusion acceleration, but the Matplotlib version of the space drawer is faster than Altair's, so I am leaving Matplotlib as the default and closing this issue.

rht avatar Mar 05 '24 13:03 rht