modelskill icon indicating copy to clipboard operation
modelskill copied to clipboard

drop() method to drop obs/mod/variable (reverse sel)

Open jsmariegaard opened this issue 1 year ago • 3 comments

pandas and xarray allow users to drop part of the dataset with a drop method - see e.g. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop.html. Which is opposite to sel() but should have the same syntax. The drop method should return a new Comparer or ComparerCollection with all data except the dropped.

Two new methods should be added to Comparer and ComparerCollection, respectively, both named drop(). Add them below sel() method https://github.com/DHI/modelskill/blob/db2f7c2d785d36e7bcef0c88c2269cbf10002ac7/modelskill/comparison/_comparison.py#L838 and https://github.com/DHI/modelskill/blob/db2f7c2d785d36e7bcef0c88c2269cbf10002ac7/modelskill/comparison/_collection.py#L301

jsmariegaard avatar Oct 13 '24 13:10 jsmariegaard

Dropping models, obs, and variables seems straightforward. What's a bit less clear is dropping according to the other arguments of sel():

  • start : do we interpret as drop all time values after this time? is it needed? (could also refer users to use sel)
  • end: do we interpret as drop all time values before this time? (could also refer users to use sel)
  • time: do we interpret as drop this specific time, but keep all others? is it needed?
  • area: do we interpret as keep everything outside of this area? is it needed?

I started #460 where I'll add the possibility to drop obs, models, and variables. Do you think the other ones are equally as important to add?

ryan-kipawa avatar Oct 24 '24 15:10 ryan-kipawa

xarray.Dataset.drop discourages the use of drop in favor of drop_vars and drop_sel, probably because of the potential confusion of dropping both rows or columns with the same method.

ecomodeller avatar Oct 24 '24 15:10 ecomodeller

I don't think you will need to drop time or area in the same way as specific modelresults or observations so let's just do obs, models, and variables. I would prefer to keep the name drop() though

jsmariegaard avatar Oct 25 '24 07:10 jsmariegaard