Adding an `IamSlice` feature

Open danielhuppmann opened this issue 3 years ago • 3 comments

Description

We often want to know which elements exist on a dimension after filtering, e.g., which variables exist for a specific region (see https://github.com/IAMconsortium/nomenclature/pull/99). This can be done as follows:

vars = df.filter(region="Region A").variable

However, this is inefficient because filter() creates a full (downselected) copy of the (timeseries) data and meta tables, even though only the index is needed.

Proposed Solution

Add a new class IamSlice, which is a derivative of the pd.MultiIndex of the internal _data pd.Series. An IamSlice is returned by the method slice(), which takes the same arguments as filter().

Expected usage

vars = df.slice(region="Region A").variable

Feb 16 '22 07:02 danielhuppmann

+1, great idea!

Feb 16 '22 11:02 gidden

I like the idea in general, but I am wondering whether an underlying boolean mask, maybe with a method to extract the indices, would not be more composable. I would also argue for accepting a slice either as the first positional argument to df.filter or in df.__getitem__, so that:

df[df.slice(region="...")] == df.filter(region="...")
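As a rough sketch of what "more composable" could mean here, the following uses plain pandas and NumPy rather than pyam. The `_data` series is a small stand-in for the internal data, and `mask` is a hypothetical helper name, not an existing API.

```python
import numpy as np
import pandas as pd

# Stand-in for the internal `_data` series (assumption, not the pyam object)
index = pd.MultiIndex.from_tuples(
    [
        ("Region A", "Primary Energy", 2020),
        ("Region A", "Emissions|CO2", 2020),
        ("Region B", "Primary Energy", 2020),
        ("Region B", "Population", 2020),
    ],
    names=["region", "variable", "year"],
)
_data = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)


def mask(data: pd.Series, **kwargs) -> np.ndarray:
    """Hypothetical helper: boolean mask over the rows matching all filters."""
    keep = np.ones(len(data), dtype=bool)
    for level, value in kwargs.items():
        keep &= data.index.get_level_values(level) == value
    return keep


in_region_a = mask(_data, region="Region A")
is_energy = mask(_data, variable="Primary Energy")

# Masks compose with the usual boolean operators before any data is touched
both = in_region_a & is_energy
either = in_region_a | is_energy

# Data is only selected at the very end, analogous to df[df.slice(...)]
subset = _data[both]

# The matching index entries can still be extracted from a mask
matching_index = _data.index[either]
```

Because a mask is just a boolean array aligned with the rows, filters can be combined with `&`, `|` and `~` and applied in a single selection at the end.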

Mar 03 '22 13:03 coroa

I like your idea about a boolean mask, but I would not know how to implement it...

On the second idea about allowing df[df.slice()], that sounds great and easily doable...

Mar 03 '22 14:03 danielhuppmann

Closed via #637

Aug 31 '22 09:08 danielhuppmann