
Remove automated sorting of data

Open danielhuppmann opened this issue 1 year ago • 2 comments

The pyam package currently sorts the _data series and the meta dataframe by their index automatically. This helps with consistency, assert-frame-equal checks and some operations like interpolation. But it can have unintended consequences in cases where the ordering is forgotten, e.g. #811

Also, the repeated ordering is probably not very resource-efficient for large IamDataFrame instances.

For pyam 3.0, I suggest dropping the automated ordering on initialization and in rename/aggregation/etc. methods, and instead providing a sort() method that can be called explicitly. We could also add a kwarg to all relevant methods to control whether to sort, but that may not be worth it on the effort-vs.-benefit trade-off.
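To illustrate the difference between implicit and explicit ordering, here is a minimal sketch in plain pandas. It does not use the pyam API; the index levels and values are made up for illustration, and the proposed sort() method is assumed to behave roughly like sort_index() on the underlying series.

```python
import pandas as pd

# Hypothetical long-format data, indexed like a simplified timeseries table
index = pd.MultiIndex.from_tuples(
    [("scen_b", "Primary Energy", 2010), ("scen_a", "Primary Energy", 2010)],
    names=["scenario", "variable", "year"],
)
data = pd.Series([2.0, 1.0], index=index, name="value")

# Without automated sorting, the insertion order is preserved
unsorted_scenarios = list(data.index.get_level_values("scenario"))

# An explicit sort is a single call where ordering actually matters
sorted_data = data.sort_index()
sorted_scenarios = list(sorted_data.index.get_level_values("scenario"))
```

With this pattern the cost of sorting is only paid where the caller asks for it.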

@phackstock @gidden @znicholls, any thoughts?

danielhuppmann • Feb 14 '24 10:02

I like the idea of making sorting optional. I cannot really think of a use case off the top of my head where I care or depend on the order of data. For assert-frame-equal we would then also introduce a keyword argument that would switch whether or not order is considered when checking for equality.
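A rough sketch of such a switch, assuming the check is built on pandas.testing.assert_frame_equal, whose check_like argument ignores the order of index and columns. The helper name frames_equal and its check_order keyword are hypothetical, not part of pyam.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Two frames with the same content but a different row order
left = pd.DataFrame({"value": [1.0, 2.0]}, index=["scen_a", "scen_b"])
right = left.loc[["scen_b", "scen_a"]]


def frames_equal(a, b, check_order=True):
    """Hypothetical helper: compare two frames, optionally ignoring order."""
    try:
        # check_like=True makes pandas ignore index and column order
        assert_frame_equal(a, b, check_like=not check_order)
    except AssertionError:
        return False
    return True


strict_result = frames_equal(left, right)
relaxed_result = frames_equal(left, right, check_order=False)
```

Here the strict comparison fails because the rows are ordered differently, while the relaxed one passes.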

phackstock • Feb 14 '24 15:02

Reminder: not sorting the time column may cause confusion when working with the wide timeseries format (e.g., when writing to xlsx)
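A small pandas sketch of the issue, using a made-up wide table rather than actual pyam output: if the years arrive out of order, the columns stay that way until they are sorted explicitly.

```python
import pandas as pd

# Hypothetical wide-format table with years as columns, in arrival order
wide = pd.DataFrame(
    {2030: [3.0], 2010: [1.0], 2020: [2.0]},
    index=pd.Index(["scen_a"], name="scenario"),
)
column_order = list(wide.columns)

# Sorting along the column axis restores chronological order before export
wide_sorted = wide.sort_index(axis=1)
sorted_column_order = list(wide_sorted.columns)
```

Sorting the time axis by default on export, even if the long-format data is left unsorted, would be one way to avoid this.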

danielhuppmann • Feb 21 '24 14:02