import/export pandas frame with NA-aware data types

Open jangorecki opened this issue 4 years ago • 2 comments

dt.Frame raises an error when importing a pandas DataFrame whose columns use the nullable Int32 dtype, which is used so that integer columns can hold missing values.

import pandas as pd
import datatable as dt
pf = pd.DataFrame({
  'id1' : pd.Series([1, pd.NA, 3], dtype='Int32'),
  'id2' : pd.Series([1, 2, 3], dtype='Int32')
})
pf.dtypes
#id1    Int32
#id2    Int32
#dtype: object
d = dt.Frame(pf)
#TypeError: Cannot create a column from <class 'pandas.core.arrays.integer.IntegerArray'>
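
Until nullable dtypes are supported directly, one possible workaround is to hand datatable plain Python lists in which missing values are None. The sketch below is only an illustration: the helper name nullable_to_lists is hypothetical and not part of pandas or datatable, and it assumes every column holds integers.

```python
import pandas as pd

def nullable_to_lists(frame):
    """Hypothetical helper: map each column to a list of Python ints, with None for pd.NA."""
    return {
        name: [None if pd.isna(value) else int(value) for value in frame[name]]
        for name in frame
    }

pf = pd.DataFrame({
    'id1': pd.Series([1, pd.NA, 3], dtype='Int32'),
    'id2': pd.Series([1, 2, 3], dtype='Int32'),
})
converted = nullable_to_lists(pf)
```

The resulting dict of lists could then be passed to dt.Frame(converted), which should build integer columns with NA in place of None. That last step is not run here and is untested against this particular datatable version.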

Moreover, when I have a datatable Frame with NAs in an integer column, calling to_pandas converts the whole column to float.
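
For the export direction, the float column can be cast back to a nullable integer dtype on the pandas side. A minimal sketch, assuming the column came back from to_pandas as float64 with NaN for missing values; here that result is simulated with a hand-built Series rather than produced by datatable:

```python
import numpy as np
import pandas as pd

# Stand-in for an integer column with an NA after to_pandas (assumption: float64 with NaN)
exported = pd.Series([1.0, np.nan, 3.0], name='id1')

# Casting to the nullable Int32 dtype turns NaN into pd.NA and restores integer values
restored = exported.astype('Int32')
```

The cast only succeeds when every non-missing value is a whole number, which holds for values that were integers before the export.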

jangorecki avatar Nov 23 '20 11:11 jangorecki

Yes, this is a new feature in pandas 1.0 that we are not using yet.

st-pasha avatar Nov 23 '20 17:11 st-pasha

Once this is resolved, this Stack Overflow question can be answered: https://stackoverflow.com/questions/64910029/how-to-convert-pandas-dataframe-to-datatable-frame-containing-int32-nullable-in

jangorecki avatar Dec 15 '20 11:12 jangorecki