
Implement IndexSlice selection for MultiIndex

Open · ikravets opened this issue 4 years ago • 1 comment

When using a MultiIndex in pandas, it is customary to select with a tuple of slices or values, usually via the pd.IndexSlice[] syntactic sugar. This is currently not implemented in Koalas. See https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html for more examples.

import numpy as np
import pandas as pd
import databricks.koalas as ks

df = pd.DataFrame(
    np.zeros((4, 4)),
    index=pd.MultiIndex.from_product([('a', 'b'), ('c', 'd')]),
    columns=pd.MultiIndex.from_product([('A', 'B'), ('C', 'D')]),
)
kdf = ks.from_pandas(df)
df.loc[(slice(None), 'c'), :]  # OK
df.loc[:, (slice(None), 'C')]  # OK
kdf.loc[(slice(None), 'c'), :]  # ERROR
kdf.loc[:, (slice(None), 'C')]  # ERROR
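
For reference, here is a small pandas-only sketch of the pd.IndexSlice spelling of the same selections, i.e. the behavior Koalas would presumably need to match. The frame is filled with np.arange instead of zeros (my change, purely so the selected rows are distinguishable), and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same shape and labels as the frame in the report, but with distinct
# values (an assumption for illustration) so selections are easy to check.
df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=pd.MultiIndex.from_product([("a", "b"), ("c", "d")]),
    columns=pd.MultiIndex.from_product([("A", "B"), ("C", "D")]),
)

idx = pd.IndexSlice

# idx[:, "c"] builds the tuple (slice(None), "c"), so both spellings
# select the rows whose second index level is "c".
rows_sugar = df.loc[idx[:, "c"], :]
rows_plain = df.loc[(slice(None), "c"), :]

# The same idea on the column axis: second column level equal to "C".
cols_sugar = df.loc[:, idx[:, "C"]]
cols_plain = df.loc[:, (slice(None), "C")]

print(rows_sugar)
print(cols_sugar)
```

Since pd.IndexSlice only constructs the key tuple, supporting the tuple-of-slices form in kdf.loc would likely cover the sugar as well.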

ikravets · May 10 '20 19:05

Also missing: kdf.loc(axis=0)[:, 'd']

which now raises: TypeError: 'LocIndexer' object is not callable
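
For comparison, a minimal pandas-only sketch of what the callable form does there: .loc(axis=0) pins the indexer to the row axis, so the whole key [:, 'd'] is read as a MultiIndex key on the rows rather than as (rows, columns). The frame uses np.arange values (an assumption for illustration, not from the report).

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same MultiIndex labels as in the report.
df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=pd.MultiIndex.from_product([("a", "b"), ("c", "d")]),
    columns=pd.MultiIndex.from_product([("A", "B"), ("C", "D")]),
)

# With axis=0 fixed, the key [:, "d"] applies to the row MultiIndex only:
# any first-level label, second-level label "d".
by_axis = df.loc(axis=0)[:, "d"]

# Equivalent explicit form with a tuple of slices for the rows
# and a full slice for the columns.
explicit = df.loc[(slice(None), "d"), :]

print(by_axis)
```

In pandas the two forms return the same rows, so the callable form could plausibly be handled in Koalas by rewriting it into the explicit tuple form once that is supported.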

WestXu · Jul 02 '20 10:07