pandas2 supported dtypes

Obvious / currently supported (see xref #20)

integer
unsigned integer
float
complex
boolean
datetime (ns)
datetime w/tz (ns)
timedelta (ns)
period
category
var length string

Informational, may want to think about the desirability of adding later

datetime/timedelta units (ms, us, D)
quantity / unit
decimal #14
aggregate or nested (list/dict) #25
flexible / void / object
coordinates (geo)
fixed length string
bytes
guids (2 x 8 bytes)
interval
IP addresses (both v4 and v6)
fractions, storing numerator and denominator as integers
union
structured (xref to #25)
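As a rough illustration of the "guids (2 x 8 bytes)" item above, a 128-bit GUID can be held as two unsigned 64-bit integers. This is only a sketch using the standard library; the helper names `guid_to_pair` and `pair_to_guid` are made up here and are not part of any pandas2 proposal.

```python
import uuid
from typing import Tuple

MASK64 = (1 << 64) - 1  # low 64 bits


def guid_to_pair(g: uuid.UUID) -> Tuple[int, int]:
    """Split a 128-bit GUID into (high, low) unsigned 64-bit halves."""
    return (g.int >> 64) & MASK64, g.int & MASK64


def pair_to_guid(high: int, low: int) -> uuid.UUID:
    """Rebuild the GUID from its two 64-bit halves."""
    return uuid.UUID(int=(high << 64) | low)


g = uuid.UUID("12345678-1234-5678-1234-567812345678")
high, low = guid_to_pair(g)
roundtrip = pair_to_guid(high, low)
```

Two parallel 8-byte columns like this would keep the storage fixed-width, which is presumably the point of the "2 x 8 bytes" note.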

Not supported (try to raise informative errors & point to ready-made solutions)

void (a variation on object)
non-supported combinations (e.g. arbitrary dtypes, though maybe a pre-defined union)

Sep 15 '16 20:09 jreback

different datetime precision & ranges (e.g. ms vs ns)
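To make the precision vs. range trade-off concrete, here is a back-of-envelope calculation, assuming timestamps are stored as a signed 64-bit count of units since 1970-01-01 (which is how NumPy's `datetime64` works). It uses only the standard library, so the ns bounds are truncated to microseconds.

```python
from datetime import datetime, timedelta

INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)

# Nanosecond units: convert the int64 limit to microseconds, since
# datetime.timedelta only resolves microseconds.
ns_upper = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
ns_lower = EPOCH - timedelta(microseconds=INT64_MAX // 1000)

# Millisecond units: the range overflows datetime (max year 9999),
# so express the half-range in years instead.
SECONDS_PER_YEAR = 86400 * 365.2425
ms_half_range_years = (INT64_MAX / 1000) / SECONDS_PER_YEAR

print(ns_lower.year, ns_upper.year)
print(round(ms_half_range_years))
```

With ns units the representable window is only a few centuries wide, while ms units cover hundreds of millions of years on either side of the epoch.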

Sep 15 '16 22:09 max-sixty

Under possible

date with no time type (edit: on second thought, is this anything more than a Period[D]?)
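For reference, this is roughly what a daily period looks like in current pandas. It is a sketch, not a statement about how pandas2 would model dates. One difference from a plain date is that a `Period` is a span with a start and an end.

```python
import pandas as pd

# A daily period carries no time-of-day component.
day = pd.Period("2016-09-15", freq="D")
next_day = day + 1  # arithmetic is in whole days

# It is still a span, so it has a start and an end timestamp.
start, end = day.start_time, day.end_time

# A range of daily periods gets a parameterized period dtype.
span = pd.period_range("2016-09-15", periods=3, freq="D")
print(day, next_day, span.dtype)
```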

Sep 15 '16 23:09 chris-b1

Some of the more extreme:

IP addresses (both v4 and v6)
fractions, storing numerator and denominator as integers
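A possible fixed-width layout for both ideas, sketched with the standard library only. Storing IPv4 addresses in the IPv4-mapped IPv6 form is an assumption made here so that v4 and v6 share one 16-byte representation; the helper `ip_to_16_bytes` is hypothetical.

```python
import ipaddress
from fractions import Fraction


def ip_to_16_bytes(text: str) -> bytes:
    """Encode a v4 or v6 address as 16 bytes (v4 as ::ffff:a.b.c.d)."""
    ip = ipaddress.ip_address(text)
    if ip.version == 4:
        return b"\x00" * 10 + b"\xff\xff" + ip.packed
    return ip.packed


packed = [ip_to_16_bytes(s) for s in ("192.168.0.1", "2001:db8::1")]

# Fractions as two parallel integer columns.
values = [Fraction(1, 3), Fraction(3, 4)]
numerators = [f.numerator for f in values]
denominators = [f.denominator for f in values]
restored = [Fraction(n, d) for n, d in zip(numerators, denominators)]
```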

Sep 16 '16 13:09 chrisaycock

@jreback wouldn't we just want a way to have user defined dtypes instead of hardcoding a limited list? Can Dynd help with this?

Sep 21 '16 22:09 datnamer

you certainly can have parameterized types, but completely generic types are a recipe for disaster. what do you think is missing for primitive / logical typing?

Sep 21 '16 22:09 jreback

What do you mean by parameterized? What types can be parameterized and by what? The link is broken.

Sorry I'm a bit lost.

I'm thinking of having a column of distribution objects or linear models or agents with their own attributes.

Sep 21 '16 22:09 datnamer

that's much too high level - though it's possible for another library to build on the pandas type system

we are talking about columns of primitives

parameterized types are things like

datetime64[D]
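For anyone else unsure what "parameterized" means here: in NumPy the unit is a parameter of the `datetime64` dtype, so the storage is the same 8-byte integer but the dtypes are distinct. A small sketch, using NumPy as it exists today rather than anything pandas2-specific:

```python
import numpy as np

day = np.dtype("datetime64[D]")
ns = np.dtype("datetime64[ns]")

# The parameter can be read back as a (unit, count) pair.
day_unit = np.datetime_data(day)
ns_unit = np.datetime_data(ns)

# Same physical width, different logical type.
same_width = day.itemsize == ns.itemsize
distinct = day != ns

# Converting between parameterizations keeps the instant.
d = np.array(["2016-09-15"], dtype=day)
as_ns = d.astype(ns)
print(day_unit, ns_unit, same_width, distinct, as_ns[0])
```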

Sep 21 '16 23:09 jreback

gotcha.

Sep 22 '16 03:09 datnamer

@datnamer either way, pandas needs to have its own metadata implementation (see the logical/physical decoupling discussion in https://pydata.github.io/pandas-design/internal-architecture.html#logical-types-and-physical-storage-decoupling). We do not want to delegate metadata details to a third party library. Data structures and computation are another matter on a case by case basis (i.e. assuming a library conforms to our memory representation expectations, we can use its algorithms). The tight coupling between metadata (numpy dtypes), memory representation, and algorithms/computation is part of why we are in the current mess.
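One way to picture the decoupling is a small metadata object that names the logical type and only refers to a physical layout. Everything below (the `LogicalType` class, its fields, the constants) is hypothetical and is not taken from the design document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalType:
    """Hypothetical user-facing type metadata, independent of storage."""

    name: str           # logical kind, e.g. "timestamp"
    physical: str       # storage layout name, e.g. "int64"
    params: tuple = ()  # type parameters as (key, value) pairs


# Two logical types that share one physical representation.
TIMESTAMP_NS = LogicalType("timestamp", "int64", (("unit", "ns"),))
TIMESTAMP_D = LogicalType("timestamp", "int64", (("unit", "D"),))

shares_storage = TIMESTAMP_NS.physical == TIMESTAMP_D.physical
logically_distinct = TIMESTAMP_NS != TIMESTAMP_D
```

Under this kind of split, an algorithm library would only need to agree on the `physical` layout, while the metadata stays owned by pandas.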

Sep 22 '16 03:09 wesm

maybe think about this: https://github.com/pydata/pandas/issues/3443, which is about nested dtypes in a single object. In another vein, we should think about a union type (which is essentially a restricted-looking object dtype); SFrame has these.

Oct 05 '16 11:10 jreback

+1 for sparse.

maybe including subtype in sparse and categorical is useful, like category[int64]
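Current pandas already tracks this information, even though `category[int64]` is only a proposed spelling here and not an existing dtype string. A sketch of where the subtype lives today:

```python
import numpy as np
import pandas as pd

# Categorical: the categories carry the subtype, the codes are small ints.
cat = pd.Categorical(np.array([10, 20, 10, 30], dtype="int64"))
cat_subtype = cat.categories.dtype
codes_dtype = cat.codes.dtype

# Sparse: the dtype exposes its subtype directly.
sparse = pd.arrays.SparseArray(np.array([0, 0, 5, 0], dtype="int64"), fill_value=0)
sparse_subtype = sparse.dtype.subtype

print(cat_subtype, codes_dtype, sparse_subtype)
```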

Oct 06 '16 10:10 sinhrks
