How to slice rows? Can it fit into the interchange, or is the standard required?

Open MarcoGorelli opened this issue 2 years ago • 1 comments

Came across this yesterday: https://github.com/microsoft/vscode-jupyter/pull/13951

        elif _VSCODE_builtins.hasattr(df, "to_pandas"):
            df = df.to_pandas().iloc[start:end]

This could be improved if only the rows that are actually needed were converted to pandas, i.e.:

        elif hasattr(df, "__dataframe__"):
            df = pd.api.interchange.from_dataframe(df.__dataframe__().slice_rows(start, end))

Or, if the Standard were actually available:

        elif hasattr(df, "__dataframe__") and hasattr(df, "__dataframe_standard__"):
            df = pd.api.interchange.from_dataframe(df.__dataframe_standard__().slice_rows(start, end).dataframe)

So, could this fit into the interchange protocol, or would filtering by rows only be possible for libraries implementing the standard?

It just seems a bit much to require the standard (especially if a separate compatibility package is required for it) if all people need to do is slice rows.
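
To make the cost difference concrete, here is a toy sketch in plain Python. Everything in it is hypothetical: `ToyFrame` stands in for any non-pandas dataframe, `to_records` stands in for an expensive conversion such as `to_pandas()`, and `slice_rows` is the proposed method, which the interchange protocol does not currently offer. It only illustrates that slicing first means converting `end - start` rows rather than all of them.

```python
from typing import Dict, List, Tuple


class ToyFrame:
    """Hypothetical columnar frame; not part of any real library."""

    def __init__(self, columns: Dict[str, List[int]]) -> None:
        self.columns = {name: list(values) for name, values in columns.items()}

    def num_rows(self) -> int:
        return len(next(iter(self.columns.values()), []))

    def slice_rows(self, start: int, end: int) -> "ToyFrame":
        # Hypothetical method mirroring the proposal: keep rows in [start, end).
        return ToyFrame({n: v[start:end] for n, v in self.columns.items()})

    def to_records(self) -> List[dict]:
        # Stand-in for an expensive conversion: touches every row the frame holds.
        names = list(self.columns)
        return [{n: self.columns[n][i] for n in names} for i in range(self.num_rows())]


def convert_then_slice(frame: ToyFrame, start: int, end: int) -> Tuple[List[dict], int]:
    records = frame.to_records()  # converts all rows, then discards most of them
    return records[start:end], len(records)


def slice_then_convert(frame: ToyFrame, start: int, end: int) -> Tuple[List[dict], int]:
    records = frame.slice_rows(start, end).to_records()  # converts only the requested rows
    return records, len(records)


frame = ToyFrame({"a": list(range(1000)), "b": list(range(1000, 2000))})
eager_rows, eager_converted = convert_then_slice(frame, 10, 15)
lazy_rows, lazy_converted = slice_then_convert(frame, 10, 15)
```

Both paths return the same five rows; the second value in each result is the number of rows that went through the conversion step, which is what a viewer paging through a large dataframe would care about.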

Jul 21 '23 09:07 MarcoGorelli

Similarly for to_array_api_compliant_object

They both seem useful enough, and independent of the Standard, that perhaps they could fit together with the interchange?

Jul 21 '23 12:07 MarcoGorelli