Keep floats ending in 0 as `Double` instead of `Integer` or `IntegerNullable`

Open ParthivNaresh opened this issue 2 years ago • 2 comments

Currently, a column of floats with no fractional part, such as `[1.0, 3.0, 12.0]`, is inferred as the `Integer` logical type, or as `IntegerNullable` if the column contains null values.

Such columns should be kept as `Double` to support downstream uses such as imputation, feature engineering, and machine learning.
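
To make the two behaviors concrete, here is a rough pure-Python sketch of the inference rule described above. It is not Woodwork's implementation: the function `infer_numeric_ltype` and its `keep_floats_as_double` flag are hypothetical names made up for illustration, and the logical types are returned as plain strings.

```python
import math


def infer_numeric_ltype(values, keep_floats_as_double=False):
    """Hypothetical helper that mimics the inference rule discussed here.

    Returns "Integer", "IntegerNullable", or "Double" as a plain string.
    """
    # Treat None and NaN as null values.
    non_null = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    has_nulls = len(non_null) < len(values)
    has_float_input = any(isinstance(v, float) for v in non_null)
    all_whole = all(float(v).is_integer() for v in non_null)

    # Proposed behavior: data supplied as floats stays Double.
    if has_float_input and keep_floats_as_double:
        return "Double"
    # Current behavior: whole-number values are inferred as integers.
    if all_whole:
        return "IntegerNullable" if has_nulls else "Integer"
    return "Double"


current = infer_numeric_ltype([1.0, 3.0, 12.0])
current_with_null = infer_numeric_ltype([1.0, None, 12.0])
proposed = infer_numeric_ltype([1.0, 3.0, 12.0], keep_floats_as_double=True)
```

With the flag off, the sketch reproduces the behavior reported in this issue; with it on, it reproduces the requested behavior.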

ParthivNaresh avatar Aug 07 '22 14:08 ParthivNaresh

@ParthivNaresh I don't think I agree with this. If there is no information after the decimal, I believe those values should be inferred and stored as integers. If users need these numeric columns as floating points, the Double logical type should be specified rather than relying on inference.

thehomebrewnerd avatar Aug 08 '22 12:08 thehomebrewnerd

I think this is a particularly weird one. I agree that the most natural thing is to look at a column and realize that the numbers there are actually integers. But I also feel, from an N=1 user perspective, that when I'm working in Python and handjamming some math or something, if I put in the decimal point I expect my data to be treated as a float from then on. I can see it going both ways.

chukarsten avatar Aug 08 '22 15:08 chukarsten