pymapd
thrift_cast with the mapdtype `BOOL` fails on columns containing string boolean values
Currently, the thrift_cast method in _pandas_loaders.py fails on columns that contain booleans stored as strings (for example 'True'). We need an initial cleanup step before converting the boolean values to int, as we already do for NaN values in boolean columns.
**How to reproduce?**
from datetime import datetime
import numpy as np
import pandas as pd
from pymapd._pandas_loaders import thrift_cast
# 'col1' mixes a real bool, a string bool, and NaN
kdf = pd.DataFrame(data={'col1': [True, 'True', np.nan], 'dt': [datetime.now(), ' 2018-12-12', '2019-12-12'],
                         'bool': [False, True, True], 'bnan': [False, True, np.nan]})
thrift_cast(kdf['col1'], 'BOOL', 0)
This raises:
ValueError: invalid literal for int() with base 10: 'True'
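For illustration, the kind of initial cleanup proposed above could look roughly like the sketch below. It is not part of pymapd: `normalize_bool_strings` and the `_BOOL_STRINGS` mapping are hypothetical names, and treating only 'true'/'false' (case-insensitive, surrounding whitespace ignored) as booleans is an assumption. It uses plain pandas and leaves real booleans and NaN untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical mapping: which spellings count as booleans is an assumption.
_BOOL_STRINGS = {'true': True, 'false': False}


def normalize_bool_strings(series):
    """Replace string spellings of booleans with real bools.

    Non-string values (real bools, NaN) pass through unchanged.
    Unrecognized strings raise ValueError instead of being guessed at.
    """
    def convert(value):
        if isinstance(value, str):
            key = value.strip().lower()
            if key in _BOOL_STRINGS:
                return _BOOL_STRINGS[key]
            raise ValueError(f"cannot interpret {value!r} as a boolean")
        return value

    return series.map(convert)


col = pd.Series([True, 'True', np.nan])
cleaned = normalize_bool_strings(col)
```

After this step the column holds only real booleans and NaN, which is the case the existing NaN handling is described as covering. Note that the helper has to visit every value in the column, so it adds a full pass over the data.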
I'm not sure we really want to get into this level of "do what I mean". We'd have to scan every string column to see what special values might be present, then auto-convert.