woodwork
Stop converting ['1', '2', '3'] to string dtype before initializing as a Double column
A pandas bug was preventing us from initializing Woodwork on a column of numeric strings, so we had to convert the column to the string dtype first.
https://github.com/alteryx/woodwork/pull/755/files#diff-75cb54847db4ed09b32148b93f1f543df16a421473154cb17085ff4932277ba8L402
This should be removed after the bug is fixed.
- https://github.com/pandas-dev/pandas/issues/40729
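For reference, the workaround described above amounts to something like the following. This is a minimal sketch, not the actual Woodwork code from the linked PR; the variable names are only illustrative.

```python
import pandas as pd

# A column of numeric strings stored with object dtype.
series = pd.Series(["1", "2", "3"], dtype="object")

# Workaround: convert to the string dtype first, then to the nullable
# Float64 dtype (the dtype used for the conversion in this thread).
# Illustrative only; not copied from Woodwork.
as_string = series.astype("string")
result = as_string.astype(pd.Float64Dtype())

print(result)
```

Once the pandas bug is fixed, the intermediate `astype("string")` step should be unnecessary.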
The original pandas issue (https://github.com/pandas-dev/pandas/issues/40729) that caused us to add this change is fixed, and the workaround described in this issue no longer seems to be present in Woodwork. So this issue can probably be closed.
pandas 1.4.3
- I still get the same error
pandas main (Aug 13), pandas-1.5.0.dev0+1285.g60b4400491
import pandas as pd
series = pd.Series(['1', '2', '3', '4'], dtype='object')
series.astype(pd.Float64Dtype())
- No error
0 1.0
1 2.0
2 3.0
3 4.0
dtype: Float64
So I believe we are blocked on this issue until pandas 1.5.0 is released.
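If we need something that behaves the same on both versions in the meantime, a fallback along these lines might work. This is only a sketch: `to_float64` is a hypothetical helper, not something in Woodwork, and the exception types caught are an assumption, since the exact error is not quoted in this thread.

```python
import pandas as pd


def to_float64(series: pd.Series) -> pd.Series:
    """Hypothetical helper: try the direct conversion, fall back to the workaround."""
    try:
        # Direct conversion; reported above to work on pandas main (1.5.0.dev0).
        return series.astype(pd.Float64Dtype())
    except (TypeError, ValueError):
        # Assumed exception types. Fall back to converting through the
        # string dtype first, as in the original workaround.
        return series.astype("string").astype(pd.Float64Dtype())


series = pd.Series(["1", "2", "3", "4"], dtype="object")
converted = to_float64(series)
print(converted.dtype)
```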
Correct. My point was that the workaround I created this issue about may no longer be present (it's no longer in that test on woodwork main), in which case this issue can be closed.