Polars extension
📝 Summary
Fixes issue #3355 by adding LazyFrame extension that allows output of mermaid graphs for polars lazyframe query plans.
🔍 Description of Changes
In this pull request I have added a polars extension that adds a mo namespace to LazyFrames, this namespace includes one "public" method show_graph that accepts the same arguments as polars LazyFrame show_graph and returns an Html object containing the mermaid chart.
Issue #3355 shows side by side outputs.
Before merging it would be necessary to add documentation for this new polars namespace, but I wanted to request your feedback before I took that step.
📋 Checklist
- [X] I have read the contributor guidelines.
- [ ] For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
- [X] I have added tests for the changes made.
- [X] I have run the code and verified that it works as expected.
📜 Reviewers
@akshayka
The latest updates on your projects. Learn more about Vercel for Git ↗︎
| Name | Status | Preview | Comments | Updated (UTC) |
|---|---|---|---|---|
| marimo-docs | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jan 7, 2025 0:35am |
| marimo-storybook | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jan 7, 2025 0:35am |
@MarcoGorelli , if you don't mind, I would appreciate hearing your thoughts on whether implementing this upstream in Polars would be possible, or if we should plan to monkey-patch Polars in marimo instead.
if the solution isn't too complex on the Polars side, then I think this would be welcome!
I don't think we need to monkey-patch anything. showing mermaid should be the default (easier to navigate, adds no deps and not lossy) and instead we can add a formatted for LazyFrame in https://github.com/marimo-team/marimo/blob/main/marimo/_output/formatters/df_formatters.py
If anything could be upstreamed, we could add _repr_mermaid_ to polars, but _repr_mermaid_ is not an official repr from jupyter (however, we can support it)
I don't think we need to monkey-patch anything. showing mermaid should be the default (easier to navigate, adds no deps and not lossy) and instead we can add a formatted for LazyFrame in https://github.com/marimo-team/marimo/blob/main/marimo/_output/formatters/df_formatters.py
If anything could be upstreamed, we could add
_repr_mermaid_to polars, but_repr_mermaid_is not an official repr from jupyter (however, we can support it)
Lazyframes already render fine in marimo according to #3355.
I believe the core issue is modifying Lazyframe.show_graph to show a mermaid diagram. This cannot be accomplished using a formatter in marimo without monkey-patching Lazyframe.show_graph, unless I am misunderstanding something. So it might be preferable to upstream marimo-aware rendering for this function if Polars team is okay with that.
I was looking at this:
Which is possible to move to mermaid since marimo can change the formatting of the last line. You would still run the issue of explicitly calling show_graph(optimized=True), which then would be nice to upstream show_graph(format="mermaid")
But yea @akshayka you are right, without anything upstreamed w.r.t mermaid, we would need to monkeypatch to cover all cases.
Looks like there was some appetite for mermaid formatting in polars: https://github.com/pola-rs/polars/issues/12075
That part seems non-controversial, but could go as far as making the default 'mermaid' instead of 'dot' in marimo
Thank you all for the discussion on this. It sounds like there is already interest on the polars side for introducing support for mermaid, which is great to here. If polars does support mermaid and is ok with us inserting a simple snippet for detecting marimo then I think we could forgo the polars extension which would remove the need to add any imports to the marimo init file.
I think then that the best way forward would be to:
- Add Mermaid support to polars
show_graph - Add a dataframe formatter to Marimo that will change the behavior of leaving a LazyFrame on the last line
- Use polars mermaid generation developed in 1
- Add support to polars
show_graphfor detecting marimo and calling the display functionality developed in 2.
If that sounds good to you all, I can close this PR and then make a new one to handle step 2 above.
That sounds good to me, thank you!
Would you like to work with Polars to lead 1 and 3 as well? If you opened an issue or PR with them, please feel free to tag me.
Yes, I will open a PR in Polars for 1 to get started and tag you in it.
Closing in favor of https://github.com/marimo-team/marimo/pull/5247