qgrid does not play well with the 'categorical' dtype on large datasets
I am running qgrid 1.3.1 and pandas 1.0.3. I have a DataFrame of 300K rows, and qgrid displays it without problems.
However, when I convert some of the columns to categorical dtype, the display and filters are very slow.
Here is a dummy example to replicate the problem:
Without categorical dtype (it should run very quickly):
import pandas as pd
import numpy as np
import qgrid

df = pd.DataFrame({
    'cat1': np.random.randint(low=1, high=1000000, size=400000),
    'cat2': np.random.randint(low=1, high=1000000, size=400000),
    'cat3': np.random.randint(low=1, high=1000000, size=400000),
    'cat4': np.random.randint(low=1, high=1000000, size=400000),
})
qgrid.show_grid(df)
When I convert the columns to categorical (silly in this example):
df2 = df.copy()
df2['cat1'] = df2['cat1'].astype('category')
df2['cat2'] = df2['cat2'].astype('category')
df2['cat3'] = df2['cat3'].astype('category')
df2['cat4'] = df2['cat4'].astype('category')
qgrid.show_grid(df2)
The display then becomes very slow, and clicking on the filters (at the top of the columns) is even slower.
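A possible stopgap, sketched here under the assumption that the slowdown really is tied to the categorical dtype as described above: cast the categorical columns back to the dtype of their categories on a copy of the frame, and pass that copy to the grid. The helper name `decategorize` is hypothetical and not part of qgrid or pandas, and whether this restores the original speed has not been verified.

```python
import pandas as pd


def decategorize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose categorical columns are cast back
    to the dtype of their categories (hypothetical helper)."""
    out = df.copy()
    for col in out.select_dtypes(include="category").columns:
        target = out[col].cat.categories.dtype
        # Integer dtypes cannot hold missing values, so fall back to float
        # when a categorical column with integer categories contains NaN.
        if target.kind in "iu" and out[col].isna().any():
            target = "float64"
        out[col] = out[col].astype(target)
    return out
```

With the example above, this would be used as `qgrid.show_grid(decategorize(df2))`, leaving `df2` itself unchanged so the categorical columns remain available for the rest of the analysis.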