matplotlib-pyodide
Matplotlib backend in a web worker
At the moment using matplotlib when Pyodide is loaded in a web worker with the following code snippet:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x));
plt.show()
gives the following error, because the wasm backend tries to create new DOM elements:
Traceback (most recent call last):
  File "<console>", line 2, in <module>
  File "/lib/python3.8/site-packages/matplotlib/pyplot.py", line 2336, in <module>
    switch_backend(rcParams["backend"])
  File "/lib/python3.8/site-packages/matplotlib/pyplot.py", line 276, in switch_backend
    class backend_mod(matplotlib.backend_bases._Backend):
  File "/lib/python3.8/site-packages/matplotlib/pyplot.py", line 277, in backend_mod
    locals().update(vars(importlib.import_module(backend_name)))
  File "/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/lib/python3.8/site-packages/matplotlib/backends/wasm_backend.py", line 23, in <module>
    from js import document
ImportError: cannot import name 'document' from 'js' (unknown location)
There is already a "Caveats" section in the docs mentioning limitations when using Pyodide in a web worker: https://pyodide.org/en/latest/usage/webworker.html#caveats
Maybe there could be another backend for matplotlib that would work in a web worker. Or document a workaround (if that's possible).
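A minimal sketch of how a script might detect this situation before importing pyplot. It assumes that `js.document` is simply absent when Pyodide runs in a worker (which is what the ImportError above suggests) and falls back to the non-interactive Agg backend through matplotlib's MPLBACKEND environment variable. The helper names `dom_available` and `configure_backend` are made up for illustration.

```python
import importlib
import os


def dom_available(module_name: str = "js") -> bool:
    """Return True if the given module can be imported and exposes `document`.

    Assumption: in a web worker the `js` module has no `document` attribute.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Not running under Pyodide at all (or the module is missing).
        return False
    return hasattr(module, "document")


def configure_backend(environ=os.environ, module_name: str = "js"):
    """Select the Agg backend when no DOM is reachable.

    Must be called before matplotlib is imported, because MPLBACKEND
    is read at import time. Returns the backend that was forced, or None.
    """
    if dom_available(module_name):
        return None  # leave the default (DOM-based) backend alone
    environ["MPLBACKEND"] = "AGG"
    return "AGG"
```

Calling `configure_backend()` at the top of the worker script, before `import matplotlib.pyplot`, would avoid the import of the DOM-based backend; the figure then has to be exported explicitly, for example with `savefig`.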
This is indeed possible. I have a plan involving comlink. If anyone is interested in implementing this, I can explain what I'm thinking.
I think we should also open an upstream issue at matplotlib, see if there is any interest there and try to get them involved.
@madhur-tandon Might also have some perspective on this.
I think the point here is to use the existing implementation as-is, but after wrapping the DOM calls in a Comlink proxy. Though it occurs to me that we should just wrap the entire DOM with Comlink, and then code on a web worker can transparently use all of the DOM calls, with the caveat that they all become async.
Once we implement pyodide/pyodide#1503, we can remove the async caveat. (I've already started working on pyodide/pyodide#1503, @joemarshall did the hard part, we just have to figure out how to integrate it properly.)
Right, but we have our own matplotlib backend, and before doing much more development work on it, I think it would be good to discuss what its future is. Are we going to have to maintain it forever, or is there a possibility to upstream at least part of it?
It's indeed a bit orthogonal to the above technical discussion, but it would still be good to have a long term plan for this matplotlib backend. At least it would be good if someone from matplotlib was vaguely following these discussions.
So I think the point here is this though: matplotlib has an API that consumes user input events and produces an image. The DOM is an API that produces user input events and consumes images. All that really needs to be done in the matplotlib backend is wiring it together. It sorta seems to me like this is most naturally our responsibility in Pyodide.
Hi, not sure if this helps, but long ago I implemented an HTML5 <canvas>-based backend for matplotlib. It's available on the GSoC branch in Pyodide. Instead of asking the Agg renderer (compiled to WASM) to draw a plot and then pasting its screenshot into the web document, it renders graphics live, directly on the web document, using the <canvas> tag. I am not sure if it will work inside a web worker, since it also contains the import statement from js import document. One advantage of using it is that we can remove the Agg renderer from the pipeline, decreasing the size of matplotlib. On the other hand, it can be a bit slower.
Anyway, let me know if this is of use here and I can revive it back for the current Pyodide. Thanks!
Anyway, let me know if this is of use here and I can revive it back for the current Pyodide.
I think you are better equipped to judge than we are, I personally don't know anything about matplotlib. I'd be interested to discuss further after the release.
Thanks for the input @madhur-tandon ! I also saw somewhere a mention of an HTML5 backend being one of the possible topics for matplotlib GSoC this year (unless I'm mixing something up). So I think after the release we could reach out to the matplotlib devs to ask what the best way to move forward with the 2 backends we have would be, and what their planned roadmap is. Clearly communication with the web worker in Pyodide would still be on us though :)
Yeah it would be great if we could upstream an exact match on the HTML5 API.
Anyway, let me know if this is of use here and I can revive it back for the current Pyodide.
@madhur-tandon If you could sync it with main and make a WIP PR so it's more visible, that would be great in any case!
@rth For the matplotlib GSoC project this year, I was one of the mentors :)
But we didn't find a student for it unfortunately, since it wasn't advertised early enough.
That is also a Canvas based backend renderer yes, but it's based on ipycanvas. So it has this extra layer of using ipycanvas.
The one which I made uses <canvas> directly (from the DOM) using Pyodide itself.
I shall try to sync it with main and make a WIP PR soon.
Thanks!
I think the proper interface for this is roughly as follows: we should make a set_frontend method on FigureCanvasWasm which takes a function,
set_frontend(add_front_end)
The argument add_front_end is an async function (I mean it can return any awaitable, not necessarily a coroutine)
async def add_front_end(listeners)
listeners would be for events like render, rubberband-mousemove, toolbar-button-click, and download.
You would have to call set_frontend before showing the plot.
This logic would then not have to care about workers at all, it would be the responsibility of the front end implementer.
@madhur-tandon Does this sound reasonable?
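To make the proposal above a bit more concrete, here is a rough, self-contained sketch. `FigureCanvasSketch` is a hypothetical stand-in for FigureCanvasWasm, not the real class, and the exact shape of `listeners` is an assumption: here it is a dict mapping the event names mentioned above to callbacks that the front end may call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Listeners = Dict[str, Callable[..., None]]
FrontEnd = Callable[[Listeners], Awaitable[None]]


class FigureCanvasSketch:
    """Hypothetical stand-in for FigureCanvasWasm (illustration only)."""

    def __init__(self) -> None:
        self._add_front_end: Optional[FrontEnd] = None
        self.log = []  # records which listeners the front end invoked

    def set_frontend(self, add_front_end: FrontEnd) -> None:
        # Any callable returning an awaitable is accepted.
        self._add_front_end = add_front_end

    def _listeners(self) -> Listeners:
        # Event names follow the proposal; the payloads are assumptions.
        return {
            "render": lambda: self.log.append("render"),
            "rubberband-mousemove": lambda x, y: self.log.append(("rubberband-mousemove", x, y)),
            "toolbar-button-click": lambda name: self.log.append(("toolbar-button-click", name)),
            "download": lambda fmt: self.log.append(("download", fmt)),
        }

    async def show(self) -> None:
        if self._add_front_end is None:
            raise RuntimeError("call set_frontend() before showing the plot")
        await self._add_front_end(self._listeners())


async def worker_front_end(listeners: Listeners) -> None:
    # A real front end would exchange messages with the main thread here;
    # this one just simulates a toolbar click followed by a render.
    listeners["toolbar-button-click"]("home")
    listeners["render"]()


canvas = FigureCanvasSketch()
canvas.set_frontend(worker_front_end)
asyncio.run(canvas.show())
```

The canvas never touches the DOM or a worker API itself; whether the front end lives on the main thread or behind a worker boundary is entirely up to the function passed to `set_frontend`.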
Nice, thanks all for the comments and ideas for future implementations :+1:
For now if someone is reading this thread and would like a basic workaround, the following code snippet patching matplotlib.pyplot.show might help:
import base64
import os
from io import BytesIO

# Select the non-interactive Agg backend before pyplot is imported
os.environ['MPLBACKEND'] = 'AGG'

import matplotlib.pyplot

def ensure_matplotlib_patch():
    _old_show = matplotlib.pyplot.show  # kept in case the original is needed

    def show():
        buf = BytesIO()
        matplotlib.pyplot.savefig(buf, format='png')
        buf.seek(0)
        # encode to a base64 str
        img = base64.b64encode(buf.read()).decode('utf-8')
        matplotlib.pyplot.clf()
        # hand the encoded image back to the caller
        return img

    matplotlib.pyplot.show = show
Again this is very basic but might do the job in some cases. The base64 encoded image can then be used as needed.
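As one possible use of that string, it can be wrapped in a data URL, which the main thread can assign to the src of an img element. The helper name `to_data_url` is made up for this sketch, and the PNG signature bytes only stand in for a real figure.

```python
import base64


def to_data_url(b64_payload: str, mime_type: str = "image/png") -> str:
    """Wrap an already base64-encoded payload in a data: URL."""
    return f"data:{mime_type};base64,{b64_payload}"


# Stand-in for the bytes savefig() would write: just the 8-byte PNG signature.
png_signature = b"\x89PNG\r\n\x1a\n"
img = base64.b64encode(png_signature).decode("utf-8")
url = to_data_url(img)
```

The worker would then post `url` (or the raw base64 string) back to the main thread, which is the part that still has to be wired up by hand.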
I should be able to revive this over the weekend. Sorry for being late but India is in a pretty bad state due to the pandemic right now.
Sorry for being late
No need to apologize! Thanks for volunteering.
India is in a pretty bad state
Best wishes to you and your family.
An update, I am almost done with reviving the html5 <canvas> based renderer. It seems like the new matplotlib version 3.3.3 has removed support for the _png module. This was being used before by me to read png data / write a png file. I am currently looking at what it is replaced by, so that I can use the same approach for the revived renderer.
The switch seems to have happened somewhere in this commit: https://github.com/matplotlib/matplotlib/commit/370e9a2d5d9e637abc90b3270d368642c69f66c6#diff-0a415dbb618fcfb73e6191c735f6e5a91f530d4a29b8886afdfd56604892de61
I have the renderer ready in my gsoc branch (of my fork).
Can you give me push rights so that I can update the gsoc branch of this repository?
Or should I open a new Pull Request from my fork's branch?
Let me know, Thanks!
I think if you would just open a pull request that would be easiest.
Okay, I am gonna open it from my fork's gsoc branch to this repository's main branch.
What is the status on this? The "from js import document" call is still there. Does the API for set_frontend exist already?
I couldn't get @jtpio 's example to work, but here's a workaround that's working for me:
from matplotlib import pyplot as plt
import io
import base64
import js

class Dud:
    def __init__(self, *args, **kwargs) -> None:
        return

    def __getattr__(self, __name: str):
        return Dud

js.document = Dud()

# Create a plot
x1, y1 = [-1, 12], [1, 4]
plt.plot(x1, y1)

# Print base64 string to stdout
bytes_io = io.BytesIO()
plt.savefig(bytes_io, format='jpg')
bytes_io.seek(0)
base64_encoded_spectrogram = base64.b64encode(bytes_io.read())
print(base64_encoded_spectrogram.decode('utf-8'))
It basically just tricks the matplotlib backend into thinking everything is fine. I've only tested it with the default backend.
I couldn't get @jtpio 's example to work
For reference, JupyterLite removed this workaround in https://github.com/jupyterlite/jupyterlite/pull/911. So I'm not sure it still applies to newer versions of Pyodide.
For the record, I created matplotlib-pyodide-worker-contrib (available on PyPI).
Install the package and then do:
import matplotlib
matplotlib.use("module://matplotlib_pyodide_worker_contrib.webworker_backend")
plt.show() now returns a string starting with base64,.
When you receive the python result inside the worker, you can then forward the string as a special field to your frontend (i.e. not your worker) and render the base64 svg image if it starts with base64,.
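The forwarding step could look roughly like the sketch below. It is written in Python only to show the logic; in practice this check would probably live in the worker's JavaScript. The function name `package_result` and the `kind` / `payload` field names are made up, and the only thing taken from the description above is the `base64,` prefix.

```python
from typing import Any, Dict

IMAGE_PREFIX = "base64,"  # prefix described above for plt.show() results


def package_result(result: Any) -> Dict[str, Any]:
    """Tag a Python result so the front end knows whether to render an image.

    The message layout is hypothetical; adapt it to your own worker protocol.
    """
    if isinstance(result, str) and result.startswith(IMAGE_PREFIX):
        return {"kind": "image", "payload": result}
    return {"kind": "value", "payload": result}
```

The front end can then branch on `kind` and render the payload as an image instead of printing it as text.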