datatable
import/export pandas frame with NA-aware data types
dt.Frame raises an error when importing a pandas frame whose columns use the nullable Int32 dtype (used so that integer columns can hold missing values).
import pandas as pd
import datatable as dt
pf = pd.DataFrame({
    'id1': pd.Series([1, pd.NA, 3], dtype='Int32'),
    'id2': pd.Series([1, 2, 3], dtype='Int32')
})
pf.dtypes
#id1 Int32
#id2 Int32
#dtype: object
d = dt.Frame(pf)
#TypeError: Cannot create a column from <class 'pandas.core.arrays.integer.IntegerArray'>
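Until nullable dtypes are supported directly, one possible workaround is to turn each column into a plain Python list with None in place of pd.NA before building the Frame. This is only a sketch: the helper name nullable_to_pylists is hypothetical, and it assumes every column holds nullable integers.

```python
import pandas as pd

def nullable_to_pylists(pf):
    """Hypothetical helper: map each column to a list of Python ints,
    with None wherever pandas reports a missing value (pd.NA)."""
    return {
        name: [None if pd.isna(v) else int(v) for v in pf[name]]
        for name in pf.columns
    }

pf = pd.DataFrame({
    'id1': pd.Series([1, pd.NA, 3], dtype='Int32'),
    'id2': pd.Series([1, 2, 3], dtype='Int32'),
})

# Plain lists with None carry no pandas extension types.
cols = nullable_to_pylists(pf)
```

The resulting dict can then be passed as dt.Frame(cols), since datatable reads None in a Python list as NA. This goes through Python objects, so it is likely slow for large frames.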
Moreover, when a datatable Frame has NAs in an integer column, calling to_pandas converts the whole column to float.
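On the pandas side, a float column that holds only whole numbers and NaN can be cast back to a nullable integer dtype. The sketch below uses a hand-built frame (as_float, a made-up name) to stand in for the to_pandas result, so it shows only the pandas half of the round trip.

```python
import numpy as np
import pandas as pd

# Stand-in for a to_pandas() result: the integer column arrived as float64,
# with NaN marking the missing value.
as_float = pd.DataFrame({'id1': [1.0, np.nan, 3.0]})

# Cast back to the nullable Int32 dtype; NaN becomes pd.NA.
# This raises if any non-missing value is not a whole number.
restored = as_float.astype({'id1': 'Int32'})
```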
Yes, this is a new feature in pandas 1.0 that we are not using yet.
Once this is resolved, this Stack Overflow question can be answered: https://stackoverflow.com/questions/64910029/how-to-convert-pandas-dataframe-to-datatable-frame-containing-int32-nullable-in