dataframe-api
RFC document, tooling and other content related to the dataframe API standard
This issue is meant to collect libraries that we should be aware of and perhaps take into account (data on how their API looks, impact of choices on those libraries,...
Should a dedicated API/column metadata to efficiently support sparse columns be part of the spec? ## Context It can be the case that a given column has more than...
This is not even half-baked, but I wanted to gauge interest/feasibility for the spec to encapsulate n-dimensional "columns" of data, equivalent to xarray's [DataArrays](http://xarray.pydata.org/en/stable/user-guide/terminology.html). In that case, the currently-envisioned columns...
This was just asked about at https://twitter.com/__AlexMonahan__/status/1430522318854377475. I'd say we should have a similar argument as https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html. We cannot prevent mutations in the protocol itself, and existing libraries already may...
I think there is consensus (correct me if I'm wrong) on having a 2-D structure where (at least) columns are labelled, and where a whole column shares a type. More...
For dataframe interchange, the smallest building block is a "buffer" (see gh-35, gh-38) - a block of memory. Interpreting that is nontrivial, especially if the goal is to build an...
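To illustrate why interpreting a buffer is nontrivial, here is a minimal sketch using only the Python standard library. It is not taken from the interchange protocol; the helper name `interpret` and the little-endian layout are assumptions made for the example. The same eight bytes decode to different values depending on the dtype the consumer assumes.

```python
import struct

# Eight raw bytes standing in for a "buffer". The bytes alone carry no dtype.
raw = struct.pack("<2i", 1, 2)


def interpret(buffer: bytes, fmt: str) -> tuple:
    """Decode `buffer` as little-endian items of one struct format code.

    `interpret` is a hypothetical helper for this sketch, not a protocol API.
    """
    item_size = struct.calcsize("<" + fmt)
    count = len(buffer) // item_size
    return struct.unpack(f"<{count}{fmt}", buffer)


as_int32 = interpret(raw, "i")  # two 32-bit integers
as_int64 = interpret(raw, "q")  # one 64-bit integer built from the same bytes
```

Without accompanying metadata (dtype, byte order, length), a consumer cannot tell which of these readings the producer intended.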
# Categorical dtypes xref gh-26 for some discussion on categorical dtypes. ## What it looks like in different libraries ### Pandas The dtype is called `category` there. See [pandas.Categorical docs](https://pandas.pydata.org/docs/reference/api/pandas.Categorical.html):...
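As a rough illustration of the idea behind a categorical column, the sketch below stores each value as an integer code pointing into a list of distinct categories. This is plain Python, not pandas code; the function `encode_categorical` is hypothetical, and sorting the categories is an assumption of the sketch.

```python
def encode_categorical(values):
    """Split `values` into integer codes and a sorted list of categories.

    Hypothetical helper for illustration; not part of any library.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    codes = [index[value] for value in values]
    return codes, categories


codes, categories = encode_categorical(["b", "a", "b", "c"])

# Decoding maps each code back to its category.
decoded = [categories[code] for code in codes]
```

pandas exposes a similar split through the `codes` and `categories` attributes of `pandas.Categorical`.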
What data types should be part of the standard? For the array API, the types have been discussed [here](https://github.com/data-apis/array-api/issues/15). A good reference for data types for data frames is the...
I have spent a lot of time trying to understand users and their behaviors in order to optimize for them. As a part of this work, I have done numerous...
Missing Data
This issue is dedicated to discussing the large topic of "missing" data. First, a bit on names. I think we can reasonably choose between `NA`, `null`, or `missing` as a...