
Implementation of an invalid value, similar to numpy's "NaT"?

Open aulemahal opened this issue 4 years ago • 3 comments

With cftime 1.1.0 on Linux (Ubuntu, 64 bit)

I am using cftime through xarray, and one of my computations returns an array whose values are dates. I want to get the number of days between these dates and a reference date. Some of the pixels of the input array are masked and set to np.nan, and the subtraction then raises an error: TypeError: unsupported operand type(s) for -: 'float' and 'cftime._cftime_Datetime360Day'.

I was wondering if there is a recommended way to handle invalid values with cftime, or a plan to support this? Maybe cftime.datetime could return numpy.NaT when the other object is either numpy.nan or numpy.NaT?

aulemahal, Mar 10 '20 20:03
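
The failure reported above, and one way to sidestep it until missing values are supported, can be sketched as follows. This is only an illustration: the standard-library `datetime` stands in for the cftime classes so the snippet runs without extra dependencies, and `days_since` is a hypothetical helper, not part of cftime or xarray. It assumes that subtracting two dates yields a `datetime.timedelta`.

```python
import math
from datetime import datetime

reference = datetime(2000, 1, 1)

# Subtracting a date from a float NaN fails, as in the report above.
try:
    math.nan - reference
except TypeError as exc:
    error_message = str(exc)


def days_since(values, reference):
    """Return the days between each date and `reference`, NaN where missing.

    Hypothetical helper: missing entries are assumed to be float NaN.
    """
    out = []
    for value in values:
        if isinstance(value, float) and math.isnan(value):
            # Propagate the missing value instead of subtracting.
            out.append(math.nan)
        else:
            out.append((value - reference).total_seconds() / 86400.0)
    return out


dates = [datetime(2000, 1, 11), math.nan, datetime(2000, 2, 1)]
result = days_since(dates, reference)
```

The helper skips the subtraction for missing entries, so the masked pixels stay NaN in the output instead of raising.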

I'm open to suggestions on how to handle missing values in the __add__ and __sub__ methods. Are there established methods that are being used in other packages?

jswhit, Mar 12 '20 00:03

I believe what I was thinking of is something like pandas' new pd.NA, defined here: https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/missing.pyx They have a lot more operations to cover, but the principle is quite simple: if the other operand is a scalar, return the NA object; if it's an array, return an array of the same shape full of NAs. Where the NA object is a singleton that implements the

The same way pandas implements this over numpy's types using "object" arrays, this could also be managed directly in xarray. However, I think cftime would gain from this, keeping its symmetry with np.datetime64.

One way would be to have a single instance of a NAdatetime class with properties similar to those of cftime.datetime (maybe subclassing it, or simply an instance of datetime) but returning None for all time parts (year, month, etc.). The base datetime would be modified so that __add__, __sub__ and __richcmp__ return this instance whenever it is encountered. If you are interested, I might have some time to work on a PR.

aulemahal, Mar 12 '20 19:03

Sure, a pull request would be welcome. I'm not sure we can reuse np.datetime64 for the same reason we can't use python datetime - it only supports one calendar. It certainly seems feasible to check for np.nan or np.nat in the __sub__ and __add__ methods.

jswhit, Mar 14 '20 14:03