pandas2

A "NULL" / "NA" logical type

Open wesm opened this issue 8 years ago • 1 comments

There are a number of places, e.g. in the CSV parser code, where we "guess" a type (e.g. np.float64) even though there is no reasonable choice.
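As a small illustration of the kind of guess meant here, the snippet below reads a CSV whose second column is entirely empty. The column names and CSV text are made up for the example; in the pandas versions I am aware of, the empty column comes back as float64 filled with NaN, although nothing in the data says it holds floats.

```python
import io

import numpy as np
import pandas as pd

# Illustrative input: column "b" has no values at all.
csv_text = "a,b\n1,\n2,\n"

df = pd.read_csv(io.StringIO(csv_text))

# The parser has to pick some dtype for "b"; it falls back to float64.
print(df.dtypes)
print(df["b"].isna().all())
```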

Many databases have the notion of a "null" type which can be cast to any other type implicitly. For example, if a column in a DataFrame has null type, then you could cast it to float64 or string and obtain an equivalent column of all NA/null values. This would flow through to concat operations.
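A minimal sketch of how such a type might behave, written against plain NumPy arrays rather than pandas internals. `NullColumn` and `concat_columns` are hypothetical names invented for this illustration, not existing pandas or pandas2 APIs, and the choice of NA sentinel per dtype (NaN, NaT, None) is an assumption.

```python
import numpy as np


class NullColumn:
    """Hypothetical column of logical 'null' type: it only knows its length."""

    def __init__(self, length):
        self.length = length

    def cast(self, dtype):
        """Materialize an all-NA array of the requested dtype."""
        dtype = np.dtype(dtype)
        if dtype.kind == "f":  # floating point: NaN
            return np.full(self.length, np.nan, dtype=dtype)
        if dtype.kind == "M":  # datetime64: NaT
            return np.full(self.length, np.datetime64("NaT"), dtype=dtype)
        if dtype.kind == "O":  # object (e.g. strings): None
            return np.full(self.length, None, dtype=object)
        raise TypeError(f"no NA representation assumed for {dtype}")


def concat_columns(columns):
    """Concatenate columns, letting null columns adopt the others' dtype."""
    concrete = [c.dtype for c in columns if not isinstance(c, NullColumn)]
    if not concrete:
        # Nothing forces a concrete type yet, so the result stays null.
        return NullColumn(sum(c.length for c in columns))
    target = np.result_type(*concrete)
    parts = [
        c.cast(target) if isinstance(c, NullColumn) else c.astype(target)
        for c in columns
    ]
    return np.concatenate(parts)


floats = np.array([1.0, 2.0])
combined = concat_columns([floats, NullColumn(2)])
still_null = concat_columns([NullColumn(1), NullColumn(2)])
```

Here `combined` takes its dtype from the float column, while `still_null` remains a `NullColumn` because no concrete type was ever involved.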

Figuring this out doesn't strike me as urgent, but it would be good to assess how invasive this change would be. In theory it would help with some rough edges, but it may well break user code that assumed a "float64 column of all NaNs".

wesm, Sep 01 '16 13:09

See here: https://github.com/pandas-dev/pandas/pull/15892#issuecomment-291641391

we have this de-facto now:

In [6]: Series([pd.NaT])
Out[6]: 
0   NaT
dtype: datetime64[ns]

In [7]: Series([np.nan])
Out[7]: 
0   NaN
dtype: float64

In [8]: Series([None])
Out[8]: 
0    None
dtype: object

but this is really no good and should simply be a 'NULL' type (or maybe 'ANY') until / unless it is assigned / coerced.

jreback, Apr 04 '17 21:04