
Cannot create a column from <class 'pandas.core.arrays.sparse.array.SparseArray'>

Open gautambak opened this issue 4 years ago • 0 comments

  • Did you find a bug in datatable, or maybe the bug found you? When I try to import a sparse dataframe into datatable, I get the following error: TypeError: Cannot create a column from <class 'pandas.core.arrays.sparse.array.SparseArray'>

I'm getting this error when passing in a one-hot encoded dataframe (built with pandas get_dummies):

import datatable as dt
DT2 = dt.Frame(one_hot)

My table is about 2M rows and 29k columns; the 30k columns are sparse. dtypes: Sparse[uint8, 0], object(1) memory usage: 63.0+ MB

  • How to reproduce the bug? I suspect this problem will occur when loading sparse data into datatable.
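A minimal reproduction might look like the sketch below. The toy DataFrame and its column names (`key`, `color`) are made up for illustration and are not my real data, and I have not checked whether every datatable version fails the same way, so the conversion is wrapped in a try block that just prints whatever exception comes back.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: one object column plus one
# categorical column that gets one-hot encoded.
df = pd.DataFrame(
    {
        "key": ["a", "b", "a", "c"],
        "color": ["red", "green", "blue", "red"],
    }
)

# sparse=True makes get_dummies back the new columns with SparseArray.
one_hot = pd.get_dummies(df, columns=["color"], sparse=True, dtype=np.uint8)
print(one_hot.dtypes)

# datatable is optional here so the sketch still runs without it installed.
try:
    import datatable as dt
except ImportError:
    dt = None

if dt is not None:
    try:
        DT2 = dt.Frame(one_hot)
    except Exception as exc:
        # On my setup this is where the TypeError about SparseArray appears.
        print(type(exc).__name__, exc)
```

The `key` column stays as object and the `color_*` columns come out with a sparse uint8 dtype, which matches the dtype mix reported for my table.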

  • What was the expected behavior? I expected the DataFrame to load into datatable so I can experiment (my end goal is a groupby, which is currently very slow with the pandas approach; this library was suggested to me).

  • Your environment? linux

  • Tag the issue with [bug] or [segfault] (depending on whether it crashes Python or not).

  • Thank you for contributing, and sorry for the inconvenience.

Thank you and let me know if you need any further details.

gautambak · Dec 28 '20 15:12