`thrift_cast` with the mapdtype `BOOL` fails to handle columns containing string booleans

Open Avinash-Raj opened this issue 6 years ago • 1 comments
Currently, the `thrift_cast` method in `_pandas_loaders.py` fails to handle columns that contain string booleans such as `'True'`. We need an initial cleanup step before converting the boolean values to int, as we already do for NaN values in boolean columns.

**How to reproduce?**

from datetime import datetime
import numpy as np
import pandas as pd
from pymapd._pandas_loaders import thrift_cast
kdf = pd.DataFrame(data={'col1': [True, 'True', np.nan], 'dt': [datetime.now(), ' 2018-12-12', '2019-12-12'],
                         'bool': [False, True, True], 'bnan': [False, True, np.nan]})
thrift_cast(kdf['col1'], 'BOOL', 0)

This raises:

ValueError: invalid literal for int() with base 10: 'True'
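
For illustration, the cleanup step described above could look roughly like the sketch below. This is not pymapd code: `normalize_bool_strings` is a hypothetical helper, and the set of strings treated as booleans (`"true"` and `"false"`, ignoring case and surrounding whitespace) is an assumption. It only uses plain pandas and would have to run before the existing int conversion.

```python
import numpy as np
import pandas as pd

# Assumption: only these spellings count as string booleans.
_BOOL_STRINGS = {"true": True, "false": False}


def normalize_bool_strings(series):
    """Hypothetical helper: map string booleans to real booleans.

    Values that are not recognised strings (real booleans, NaN,
    unrelated strings) are passed through unchanged.
    """
    def convert(value):
        if isinstance(value, str):
            key = value.strip().lower()
            if key in _BOOL_STRINGS:
                return _BOOL_STRINGS[key]
        return value

    return series.map(convert)


# Same shape as 'col1' in the reproduction above.
col = pd.Series([True, "True", np.nan], dtype=object)
cleaned = normalize_bool_strings(col)
```

NaN is deliberately left alone here so that whatever NaN handling already exists for boolean columns still applies afterwards.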

Avinash-Raj avatar Sep 19 '19 02:09 Avinash-Raj

I'm not sure we really want to get into this level of "do what I mean". We'd have to scan every string column to see what special values might be present, then auto-convert.

randyzwitch avatar Sep 19 '19 12:09 randyzwitch