Progress bar doesn't work well with objects that have no len()
Describe the bug
First of all, thanks a bunch for marimo -- it's my favourite tool I ran across in 2024!
One issue I ran into was finding a solid tqdm equivalent. [mo.status.progress_bar](https://docs.marimo.io/api/status.html) does the job pretty well, but it doesn't work with objects that have no len(), such as generators. These come up quite often, for instance when iterating over df.iterrows().
One way to fix this would be to let the caller pass the total to mo.status.progress_bar instead of always computing it, which currently fails with TypeError: object of type 'generator' has no len(): https://github.com/marimo-team/marimo/blob/8a18849f60add7b51065c87c103af1aae8ff7487/marimo/_plugins/stateless/status/_progress.py#L274-L279
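To make the proposal concrete, here is a minimal sketch of the fallback logic I have in mind. It is not marimo's actual implementation: `resolve_total` and `iter_with_progress` are hypothetical names used only for illustration, and the real change would live inside the progress bar code linked above.

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

T = TypeVar("T")


def resolve_total(collection: Iterable[T], total: Optional[int] = None) -> int:
    """Hypothetical helper: prefer an explicit total, else fall back to len()."""
    if total is not None:
        return total
    try:
        return len(collection)  # type: ignore[arg-type]
    except TypeError as exc:
        # Generators and other unsized iterables end up here.
        raise TypeError("collection has no len(); pass total= explicitly") from exc


def iter_with_progress(
    collection: Iterable[T], total: Optional[int] = None
) -> Iterator[Tuple[int, int, T]]:
    """Yield (items_done, total, item) so a UI could render done/total."""
    resolved = resolve_total(collection, total)
    for done, item in enumerate(collection, start=1):
        yield done, resolved, item


# A generator has no len(), so the total is supplied by the caller.
gen = (n * n for n in range(3))
steps = list(iter_with_progress(gen, total=3))
```

Sized collections such as lists and ranges would keep working without the extra argument, so the change should be backwards compatible.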
This is also how tqdm handles this (see for instance https://github.com/softhints/Pandas-Tutorials/blob/master/tqdm/1.progress-bars-pandas-python-tqdm.ipynb).
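Until something like this exists, one possible workaround is to wrap the generator in an object that reports a known length. The sketch below uses a hypothetical `SizedIterable` class. I am assuming here that the progress bar only needs `len()` and iteration from its argument, which I have not verified beyond the linked lines.

```python
from typing import Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


class SizedIterable(Generic[T]):
    """Hypothetical wrapper: pairs an iterable with a known length."""

    def __init__(self, iterable: Iterable[T], length: int) -> None:
        self._iterable = iterable
        self._length = length

    def __iter__(self) -> Iterator[T]:
        return iter(self._iterable)

    def __len__(self) -> int:
        # The caller is responsible for passing the correct length.
        return self._length


# Stand-in for df.iterrows(); the length would be len(df) in practice.
rows = SizedIterable((n for n in [10, 20, 30]), length=3)
size = len(rows)
values = list(rows)
```

Materializing the generator with list(...) would also work, at the cost of holding every item in memory at once.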
Let me know if this would make sense -- I'd be happy to try out submitting a PR with the change.
Environment
{
  "marimo": "0.1.76",
  "OS": "Darwin",
  "OS Version": "23.2.0",
  "Processor": "arm",
  "Python Version": "3.11.4",
  "Binaries": {
    "Chrome": "120.0.6099.216",
    "Node": "v20.5.0"
  },
  "Requirements": {
    "black": "23.12.1",
    "click": "8.1.7",
    "jedi": "0.19.1",
    "pymdown-extensions": "10.7",
    "tomlkit": "0.12.3",
    "tornado": "6.4",
    "typing_extensions": "4.9.0"
  }
}
Code to reproduce
import marimo

__generated_with = "0.1.76"
app = marimo.App()


@app.cell
def __():
    import pandas as pd
    import time
    import marimo as mo
    from tqdm import tqdm
    return mo, pd, time, tqdm


@app.cell
def __(mo, time):
    # Works: range(5) has a len()
    for x in mo.status.progress_bar(range(5)):
        print(x)
        time.sleep(x)
    return x,


@app.cell
def __(pd):
    # Example 1D list
    data = [10, 20, 30, 40, 50]

    # Define column name
    column = 'Value'

    # Create a DataFrame
    df = pd.DataFrame(data, columns=[column])
    df
    return column, data, df


@app.cell
def __(df, time, tqdm):
    # Works: tqdm accepts a generator
    for y, _row in tqdm(df.iterrows()):
        print(y)
        time.sleep(y)
    return y,


@app.cell
def __(df, mo, time):
    # Fails: TypeError: object of type 'generator' has no len()
    for i, _row in mo.status.progress_bar(df.iterrows()):
        print(i)
        time.sleep(i)
    return i,


if __name__ == "__main__":
    app.run()
Hey @mrshu! Thanks for the thorough issue report. Adding an optional argument, total, is a great idea. Please do make a PR! And let us know if you need any help.