
qgrid does not play well with 'categorical' dtype and large dataset

Open robertour opened this issue 5 years ago • 0 comments

I am running qgrid 1.3.1 and pandas 1.0.3. I have a DataFrame of 300K rows, and it runs well.

However, when I convert some of the columns to categorical dtype, the display and filters are very slow.

Here is a dummy example to replicate the problem:

Without categorical dtype (it should run very quickly):

import pandas as pd
import numpy as np
import qgrid

df = pd.DataFrame({'cat1': np.random.randint(low=1, high=1000000, size=400000),
                   'cat2': np.random.randint(low=1, high=1000000, size=400000),
                   'cat3': np.random.randint(low=1, high=1000000, size=400000),
                   'cat4': np.random.randint(low=1, high=1000000, size=400000),
                   })

qgrid.show_grid(df)

When I convert the columns to categorical (silly in this example):

df2 = df.copy()
df2['cat1'] = df2['cat1'].astype('category')
df2['cat2'] = df2['cat2'].astype('category')
df2['cat3'] = df2['cat3'].astype('category')
df2['cat4'] = df2['cat4'].astype('category')
qgrid.show_grid(df2)

Then the display is very slow, and clicking on the filters (at the top of the columns) is even slower.
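
To help narrow this down, here is a small sketch that uses only pandas and numpy (no qgrid). It counts the categories in each column and casts the categorical columns back to the dtype of their categories. The reduced row count and the `decategorize` helper are assumptions made for illustration, not part of qgrid or pandas. I have not confirmed that the number of categories is what makes qgrid slow, or that the cast restores the original speed in qgrid.

```python
import numpy as np
import pandas as pd

# Smaller stand-in for the DataFrame above (row count reduced so it runs quickly).
rng = np.random.default_rng(0)
df = pd.DataFrame({f'cat{i}': rng.integers(low=1, high=1000000, size=40000)
                   for i in range(1, 5)})
df2 = df.astype('category')

# Number of distinct categories per column. With random integers drawn from
# a large range, most rows end up with their own category.
n_categories = {col: len(df2[col].cat.categories) for col in df2.columns}
print(n_categories)


def decategorize(frame):
    """Hypothetical helper: return a copy in which every categorical column
    is cast back to the dtype of its categories."""
    out = frame.copy()
    for col in out.select_dtypes(include='category').columns:
        out[col] = out[col].astype(out[col].cat.categories.dtype)
    return out


df3 = decategorize(df2)
print(df3.dtypes)
# qgrid.show_grid(df3)  # untested assumption: behaves like the non-categorical case
```

If `df3` displays quickly while `df2` does not, that would point at how the categorical columns are handled rather than at the data size.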

robertour, May 22 '20 18:05