spacepy
pycdf: wrong conversion from CDF_EPOCH to datetime object
CDF_EPOCH values close to the end of a day are wrongly converted to datetime.datetime
objects.
E.g., the following code:
import spacepy.pycdf
for cdf_epoch in [63570441599979.984, 63570441599999.984, 63570441600019.984]:
    timestamp = spacepy.pycdf.lib.epoch_to_datetime(cdf_epoch)
    print("%.3f -> %s" % (cdf_epoch, timestamp.isoformat()))
produces:
63570441599979.984 -> 2014-06-19T23:59:59.979000
63570441599999.984 -> 2014-06-20T23:59:59.999000
63570441600019.984 -> 2014-06-20T00:00:00.019000
Note that for the middle value the calendar date is rounded up to 2014-06-20 while the time of day stays at 23:59:59.999 (which belongs to 2014-06-19), making the result 1 day ahead of the actual input CDF_EPOCH value.
Observed on Linux with Python 3.7 and Python 2.7, spacepy 0.2.1 (PyPI), NASA CDF 3.7.1.
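As an independent cross-check of the expected values, here is a minimal pure-Python sketch that does not call the CDF library. It assumes that CDF_EPOCH counts milliseconds from 0000-01-01T00:00:00 (so that 1970-01-01 corresponds to 62167219200000) and that sub-millisecond digits are truncated, which matches the outputs above. The name `epoch_to_datetime_reference` is hypothetical and not part of spacepy.

```python
import datetime

# Assumption: 1970-01-01T00:00:00 expressed as CDF_EPOCH (ms since year 0).
CDF_EPOCH_AT_UNIX_EPOCH_MS = 62167219200000


def epoch_to_datetime_reference(cdf_epoch):
    """Convert a CDF_EPOCH value to a naive datetime, truncating to whole ms."""
    whole_ms = int(cdf_epoch)  # drop the sub-millisecond part
    offset = datetime.timedelta(milliseconds=whole_ms - CDF_EPOCH_AT_UNIX_EPOCH_MS)
    return datetime.datetime(1970, 1, 1) + offset


for cdf_epoch in [63570441599979.984, 63570441599999.984, 63570441600019.984]:
    timestamp = epoch_to_datetime_reference(cdf_epoch)
    print("%.3f -> %s" % (cdf_epoch, timestamp.isoformat()))
```

With this reference conversion the middle value comes out as 2014-06-19T23:59:59.999000, i.e. the date and the time of day agree.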
Thanks for the specific example. I've confirmed this and it looks like it's upstream in the NASA CDF library. I'll report it up and work on a workaround.
This was fixed in cdf 3.8.0.1.
Thanks Bernie. I'm still going to put in a workaround for earlier versions but good to know it's fixed.
Just narrowing down where the rounding issue happens (using an install with CDF 3.7.1) on request...
On the reported date range:
>>> spacepy.pycdf.lib.epoch_to_datetime(63570441599999.98046874)
datetime.datetime(2014, 6, 19, 23, 59, 59, 999000)
>>> spacepy.pycdf.lib.epoch_to_datetime(63570441599999.98046875)
datetime.datetime(2014, 6, 20, 23, 59, 59, 999000)
and on a different date:
>>> spacepy.pycdf.lib.epoch_to_datetime(63672220799999.98)
datetime.datetime(2017, 9, 9, 23, 59, 59, 999000)
>>> spacepy.pycdf.lib.epoch_to_datetime(63672220799999.984)
datetime.datetime(2017, 9, 10, 23, 59, 59, 999000)
>>> spacepy.pycdf.lib.epoch_to_datetime(63672220799999.98046874)
datetime.datetime(2017, 9, 9, 23, 59, 59, 999000)
>>> spacepy.pycdf.lib.epoch_to_datetime(63672220799999.98046875)
datetime.datetime(2017, 9, 10, 23, 59, 59, 999000)
So the rounding is happening between .98046874 and .98046875. (I tested a lot of intermediate values...) Since CDF_EPOCH is in milliseconds, though, the rounding here is at the picosecond level and we're around the 16th significant digit, which is at the edge of the precision for doubles. Having this fixed upstream is useful, but it's also a good reminder that we should prefer CDF_TIME_TT2000 (or CDF_EPOCH16, I suppose) and stick to recent versions of the CDF library...
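One way to read that threshold: a quick check with plain Python floats (no CDF library involved) suggests the two literals are parsed to two adjacent doubles, so there are no distinct inputs between them. A minimal sketch, assuming IEEE 754 doubles; `frac_part` is a helper name made up for illustration.

```python
from fractions import Fraction

lo = float("63570441599999.98046874")  # converted correctly above
hi = float("63570441599999.98046875")  # converted one day ahead above


def frac_part(x):
    """Exact fractional part of a double, as a Fraction."""
    return Fraction(x) - int(x)


# Spacing of doubles at this magnitude is 2**-7 ms (0.0078125 ms).
print(hi - lo == 2.0 ** -7)
print(frac_part(lo), frac_part(hi))

# The shorter literals from the report collapse onto the same two doubles.
print(float("63570441599999.984") == hi)
print(float("63570441599999.98") == lo)
```

So the values actually reaching the library are ...999.9765625 (125/128) and ...999.984375 (126/128), and only the latter triggers the wrong date.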
On which note @jtniehof - do we want to update our minimum supported version of CDF? We're currently listing 2.7.0, which is more than two decades and a lot of bug fixes old.
Thanks @drsteve ...that does suggest a floating point limited precision bug in the earlier CDF libraries. I can probably think of an appropriate fudge.
As for the minimum version, #198 will be changing some defaults along with the Python 2 removal, and I'll look at updating the minimum after that is complete. We carry a workaround for a bug that's fixed in 3.4.1, and TT2000 support was added in 3.4.0 (the CDF changelog says 3.3.2, but there were bugs that mean I don't touch it until 3.4.0). 3.5.0 is from 2012, so that's probably going to be a reasonable place to jump (and update #423 to be against that instead of 2.7, if I don't get to it before then.)