marimo dataframe view: applying a filter returns zero records when many should match
Describe the bug
Hi, I apply a filter by value in a dataframe. After I apply it I have zero records in the view, but I should have many.
If this happens because there are too many records for the filters to work properly, shouldn't the user be warned?
Environment
{
  "marimo": "0.13.2",
  "OS": "Linux",
  "OS Version": "5.15.167.4-microsoft-standard-WSL2",
  "Processor": "",
  "Python Version": "3.11.2",
  "Binaries": {
    "Browser": "121.0.6167.184",
    "Node": "v20.17.0"
  },
  "Dependencies": {
    "click": "8.1.8",
    "docutils": "0.21.2",
    "itsdangerous": "2.2.0",
    "jedi": "0.19.2",
    "markdown": "3.8",
    "narwhals": "1.36.0",
    "packaging": "25.0",
    "psutil": "7.0.0",
    "pygments": "2.19.1",
    "pymdown-extensions": "10.14.3",
    "pyyaml": "6.0.2",
    "starlette": "0.46.2",
    "tomlkit": "0.13.2",
    "typing-extensions": "4.13.2",
    "uvicorn": "0.34.2",
    "websockets": "15.0.1"
  },
  "Optional Dependencies": {
    "duckdb": "1.2.2",
    "pandas": "2.2.3",
    "polars": "1.27.1",
    "pyarrow": "19.0.1",
    "pycrdt": "0.11.1",
    "ruff": "0.11.7",
    "sqlglot": "26.16.2"
  },
  "Experimental Flags": {}
}
Code to reproduce
# csv source https://www.italiadomani.gov.it/content/dam/sogei-ng/opendata/PNRR_Progetti.csv
import marimo

__generated_with = "0.13.2"
app = marimo.App(width="medium")


@app.cell
def _():
    import polars as pl
    import marimo as mo
    return mo, pl


@app.cell
def _(pl):
    df = pl.read_csv(
        "PNRR_Progetti_01.csv",
        separator=";",
        has_header=True,
        infer_schema_length=100_000,
        null_values=["N/A"],
        decimal_comma=True
    )
    return (df,)


@app.cell
def _(df, pl):
    date_columns = [col for col in df.columns if col.startswith("Data")]
    df_updated = df.clone()  # Create a copy to avoid modifying the original DataFrame
    for col in date_columns:
        # Parse day/month/year strings into proper Date values
        df_updated = df_updated.with_columns(
            pl.col(col).str.strptime(pl.Date, "%d/%m/%Y").alias(col)
        )
    return (df_updated,)


@app.cell
def _(df_updated, mo):
    mo.ui.table(df_updated, max_columns=None)
    return


if __name__ == "__main__":
    app.run()
Strange, I cannot reproduce it 🤔
Thank you @Light2Dark. I don't know how to debug this further, though. It always happens to me with this dataframe :(
I'll take the opportunity to ask you a question: I see that in your dataframe, for example, no histogram appears for the "Mission" field. The same happens for me. Why isn't it generated? Does this occur for all large dataframes?
I see. I can test with more datasets too, but there are no size restrictions for this filter. Does it work with other dataframes?
Does this occur for all large dataframes?
Yes, the column charts are not done for large dfs. There is a related issue https://github.com/marimo-team/marimo/issues/3104. If we solve this, I think we could increase the limit. Relevant code path
I cannot reproduce this either (and I tried to match your versions of polars/narwhals).
I do see the rows being updated correctly, though. Just the results are empty.
@aborruso can you look at the network request and see if you see results there?
Yes, the column charts are not done for large dfs. There is a related issue #3104. If we solve this, I think we could increase the limit. Relevant code path
Hi, I'm using it with a small dataframe and I have no chart
I cannot reproduce this either (and I tried to match your versions of polars/narwhals).
I do see the rows being updated correctly, though. Just the results are empty.
@aborruso can you look at the network request and see if you see results there?
I have these in the console
Hi, I'm using it with a small dataframe and I have no chart
Charts are limited to 40 columns and 20k rows.
You can check the network tab: you may see a request ending with .json when you apply the filter.
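The chart limits quoted above can be written down as a small check. This is only a rough sketch and not marimo's actual code: the helper name `charts_expected` is made up for illustration, and treating the limits as inclusive is an assumption. Only the two numbers (40 columns, 20k rows) come from this thread.

```python
# Hypothetical helper, not part of marimo's API.
MAX_CHART_COLUMNS = 40   # column limit quoted in this thread
MAX_CHART_ROWS = 20_000  # row limit quoted in this thread


def charts_expected(n_rows: int, n_cols: int) -> bool:
    """Return True if a table of this shape is within both chart limits.

    Assumes the limits are inclusive, which the thread does not confirm.
    """
    return n_rows <= MAX_CHART_ROWS and n_cols <= MAX_CHART_COLUMNS
```

With a polars dataframe this could be called as `charts_expected(df.height, df.width)`, which would explain missing charts on a dataframe that has few rows but more than 40 columns.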
Hi, I am very sorry. I restarted the machine, without changing anything and now everything works. I feel like screaming and I apologize for the time I have wasted.
@Light2Dark @mscolnick I'm reopening the issue because I realized why I had this problem before and then no longer had it.
In my notebook it does not work if I use mo.ui.table(df, max_columns=None)
It works if I simply use df
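One way to cross-check the expected number of matches independently of either rendering path is to count them straight from the CSV text. The following is a minimal sketch using only the standard library; the function name `count_matches`, the sample data, and the column name `Missione` are made up for illustration, and the real file will differ.

```python
import csv
import io


def count_matches(csv_text: str, column: str, value: str, delimiter: str = ";") -> int:
    """Count rows whose `column` equals `value` exactly."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return sum(1 for row in reader if row[column] == value)


# Made-up sample; the real file and its column names will differ.
sample = "Missione;Importo\nM1;10\nM2;20\nM1;30\n"
print(count_matches(sample, "Missione", "M1"))
```

If this count is non-zero for a value that shows zero rows in the table view, the problem is more likely in the view than in the data.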
@aborruso are you still experiencing this?
Tomorrow I will test.
Thank you very much
Hi @mscolnick: yes, the behavior is the same, and I tested with 0.13.9
Thank you
# csv source https://www.italiadomani.gov.it/content/dam/sogei-ng/opendata/PNRR_Progetti.csv
import marimo

__generated_with = "0.13.8"
app = marimo.App(width="medium")


@app.cell
def _():
    import polars as pl
    import marimo as mo
    return mo, pl


@app.cell
def _(pl):
    df = pl.read_csv(
        "PNRR_Progetti_01.csv",
        separator=";",
        has_header=True,
        infer_schema_length=100_000,
        null_values=["N/A"],
        decimal_comma=True
    )
    return (df,)


@app.cell
def _(df, pl):
    date_columns = [col for col in df.columns if col.startswith("Data")]
    df_updated = df.clone()  # Create a copy to avoid modifying the original DataFrame
    for col in date_columns:
        # Parse day/month/year strings into proper Date values
        df_updated = df_updated.with_columns(
            pl.col(col).str.strptime(pl.Date, "%d/%m/%Y").alias(col)
        )
    return (df_updated,)


@app.cell
def _(df_updated, mo):
    mo.ui.table(df_updated, max_columns=None)
    return


if __name__ == "__main__":
    app.run()