Error when using custom dataframe types (modin/lux/dask dataframes)

Open cBournhonesque opened this issue 4 years ago • 0 comments

Some packages (modin, dask) replace the pandas dataframe with their own types that expose a similar interface.

This causes issues with sklearn-pandas, specifically in this function: https://github.com/scikit-learn-contrib/sklearn-pandas/blob/master/sklearn_pandas/dataframe_mapper.py#L311

I fixed this issue by replacing

    def get_dtype(self, ex):
        if isinstance(ex, np.ndarray) or sparse.issparse(ex):
            return [ex.dtype] * ex.shape[1]
        elif isinstance(ex, pd.DataFrame):
            return list(ex.dtypes)
        else:
            raise TypeError(type(ex))

with

    def get_dtype(self, ex):
        if isinstance(ex, np.ndarray) or sparse.issparse(ex):
            return [ex.dtype] * ex.shape[1]
        else:
            return list(ex.dtypes)

But there must be a better solution. How could we handle those separate types of dataframes?
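
One possible direction, as a minimal sketch rather than a tested patch: keep the explicit error for unsupported inputs, but detect dataframes by duck typing on a `dtypes` attribute instead of `isinstance(ex, pd.DataFrame)`. This assumes the alternative dataframe types expose a pandas-like `dtypes`. The function is written standalone here (no `self`), and `FrameLike` is a hypothetical stand-in for such a type, not a class from any of those libraries.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def get_dtype(ex):
    """Return one dtype per column of ``ex``."""
    # Arrays and sparse matrices hold a single dtype, repeated per column.
    if isinstance(ex, np.ndarray) or sparse.issparse(ex):
        return [ex.dtype] * ex.shape[1]
    # Duck typing: accept any object exposing a pandas-like ``dtypes``
    # (assumption: non-pandas dataframe types provide this attribute).
    dtypes = getattr(ex, "dtypes", None)
    if dtypes is not None:
        return list(dtypes)
    # Anything else is still rejected explicitly, as in the original code.
    raise TypeError(type(ex))


class FrameLike:
    """Hypothetical non-pandas dataframe type used only for illustration."""

    def __init__(self, df):
        self._df = df

    @property
    def dtypes(self):
        return self._df.dtypes
```

With this, `get_dtype(FrameLike(df))` returns the same list as `get_dtype(df)`, while a plain Python list still raises `TypeError`.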

cBournhonesque, Nov 17 '21 18:11