Rounding in num2date

Open davidhassell opened this issue 8 years ago • 14 comments

Hello,

I'm getting some rounding issues from netCDF4.netcdftime.utime.num2date with the Gregorian calendar - it occasionally throws up some microseconds where none should be. Is this a known issue, perhaps a limitation of the algorithm?

In [1]: import netCDF4

In [2]: netCDF4.__version__
Out[2]: '1.2.4'

In [3]: u = netCDF4.netcdftime.utime('hours since 1999-12-1')

In [4]: u.num2date(1.0)
Out[4]: datetime.datetime(1999, 12, 1, 1, 0)

In [5]: u.num2date(2.0)
Out[5]: datetime.datetime(1999, 12, 1, 2, 0, 0, 6)

In [6]: u.num2date(3.0)
Out[6]: datetime.datetime(1999, 12, 1, 3, 0)

Also with days:

In [7]: u = netCDF4.netcdftime.utime('days since 1999-12-1')

In [8]: u.num2date(1./24)
Out[8]: datetime.datetime(1999, 12, 1, 1, 0)

In [9]: u.num2date(2./24)
Out[9]: datetime.datetime(1999, 12, 1, 2, 0, 0, 6)

In [10]: u.num2date(3./24)
Out[10]: datetime.datetime(1999, 12, 1, 3, 0)

Many thanks,

David

davidhassell avatar Oct 04 '16 10:10 davidhassell

I believe that this is a known issue: the documentation string for netcdftime.utime states:

Example usage:

>>> from netcdftime import utime
>>> from datetime import  datetime
>>> cdftime = utime('hours since 0001-01-01 00:00:00')
>>> date = datetime.now()
>>> print date
2006-03-17 16:04:02.561678
>>>
>>> t = cdftime.date2num(date)
>>> print t
17577328.0672
>>>
>>> date = cdftime.num2date(t)
>>> print date
2006-03-17 16:04:02
>>>

The resolution of the transformation operation is approximately 0.1 seconds.

The documentation of date2num and num2date at http://unidata.github.io/netcdf4-python/ should probably mention this too, though.

ckhroulev avatar Oct 04 '16 15:10 ckhroulev

I think that documentation is out of date. The current date2num docs say

"Accuracy is somewhere between a millisecond and a microsecond"

and if I rerun the example in that docstring I get

2016-10-04 16:45:25.858372
17669824.7572
2016-10-04 16:45:25.858378

jswhit avatar Oct 04 '16 20:10 jswhit

Pull request #591 updates the docstrings to state millisecond accuracy everywhere.

jswhit avatar Oct 05 '16 12:10 jswhit

Hello,

Thank you for clarifying this.

It occurs to me that, as the accuracy is 0.001 seconds, couldn't answers be rounded to the nearest millisecond? This would prevent downstream issues that arise when comparisons between two datetimes give the wrong answer due to spurious microsecond values.

David

davidhassell avatar Oct 05 '16 13:10 davidhassell

datetime uses seconds and microseconds, not milliseconds.

jswhit avatar Oct 05 '16 14:10 jswhit

If you use timedelta to compare two datetime instances, you can set the resolution to milliseconds with timedelta(milliseconds=1).
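
For example, something along these lines (a sketch reusing the utime instance from the original report):

from datetime import datetime, timedelta
import netCDF4

u = netCDF4.netcdftime.utime('hours since 1999-12-1')
d = u.num2date(2.0)                      # may carry a few spurious microseconds
expected = datetime(1999, 12, 1, 2)
# compare with a millisecond tolerance instead of testing for exact equality
print(abs(d - expected) < timedelta(milliseconds=1))

True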

jswhit avatar Oct 05 '16 14:10 jswhit

Yes, but you could round to 1000 microseconds :)

I'm not sure what you mean by the timedelta. I'm thinking of operations like date1 < date2, where date1 and date2 are datetime objects, e.g.

>>> from datetime import datetime
>>> datetime(2000, 1, 2, 0, 0, 0, 100007) > datetime(2000, 1, 2, 0, 0, 0, 100006)
True

If the last few microsecond digits are noise, we run into difficulties.

davidhassell avatar Oct 05 '16 15:10 davidhassell

Disregard my timedelta comment. It's not immediately obvious to me how 1000-microsecond rounding could be implemented, but I'll give it some thought.

jswhit avatar Oct 05 '16 16:10 jswhit

That's great - thanks.

davidhassell avatar Oct 05 '16 21:10 davidhassell

After thinking about this some more, I don't see how rounding to the nearest millisecond is going to help. You will still have spurious microseconds showing up in the datetime instances, just as in your example. Am I missing something?

jswhit avatar Oct 06 '16 12:10 jswhit

I was thinking, somewhat naively I suspect, of something along these lines in, e.g., DateFromJulianDay: replacing

microsecond = microsecond.astype(np.int32)

with

microsecond = microsecond.astype(np.int32).round(-3)
if microsecond == 1000000:
    second += 1
    microsecond -= 1000000
    # Uh oh - what if second is now 60 ... ?

So that the returned microseconds value is always one of 0, 1000, 2000, ..., 998000, 999000.

But I see the difficulty in this approach of propagating the rounding up the ladder of datetime elements.
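
To make the concern concrete, here is a small self-contained sketch of the problem case (illustrative values only, not the actual DateFromJulianDay internals):

import numpy as np

second, microsecond = 59, 999600              # 59.9996 s into the minute
microsecond = int(np.round(microsecond, -3))  # rounds up to 1000000
if microsecond == 1000000:
    second += 1                               # second is now 60 ...
    microsecond = 0
# ... so the carry would still have to ripple up through minutes, hours,
# days and months before a valid datetime could be built.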

davidhassell avatar Oct 06 '16 13:10 davidhassell

I think that rounding is inevitably going to produce surprising results in some situations. Perhaps it's better to leave it up to the user to round the datetime instances (perhaps using some of the ideas here: http://stackoverflow.com/questions/3463930/how-to-round-the-minute-of-a-datetime-object-python).

For example,

import netCDF4
import datetime
u = netCDF4.netcdftime.utime('hours since 1999-12-1')
d = u.num2date(2.0)
print d
print d-datetime.timedelta(microseconds=d.microsecond) # microsecond floor

1999-12-01 02:00:00.000006
1999-12-01 02:00:00
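
A nearest-millisecond rounding (rather than a floor) can be built the same way; since the adjustment is applied as a timedelta, any carry into seconds, minutes or beyond is handled by datetime itself, which sidesteps the "what if second is now 60" worry above. A rough sketch:

import datetime

def round_millisecond(d):
    # round a datetime to the nearest millisecond; the timedelta
    # arithmetic absorbs any carry into the larger fields
    excess = d.microsecond % 1000
    if excess >= 500:
        return d + datetime.timedelta(microseconds=1000 - excess)
    return d - datetime.timedelta(microseconds=excess)

print(round_millisecond(datetime.datetime(1999, 12, 1, 2, 0, 0, 6)))
print(round_millisecond(datetime.datetime(1999, 12, 31, 23, 59, 59, 999600)))

1999-12-01 02:00:00
2000-01-01 00:00:00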

jswhit avatar Oct 14 '16 18:10 jswhit

Maybe related to this, I get the following rounding error when the "since" string has less-than-second resolution:

netCDF4.num2date(0., 'seconds since 2013-05-15T00:00:34.653020')
datetime.datetime(2013, 5, 15, 0, 0, 34)

netCDF4.num2date(0.3, 'seconds since 2013-05-15T00:00:34.653020')
datetime.datetime(2013, 5, 15, 0, 0, 34, 300000)

floogit avatar Dec 27 '18 15:12 floogit

cftime does not currently support less-than-second resolution in the units string. If this is an important use case for you, please create an issue at https://github.com/Unidata/cftime/issues.
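
In the meantime, one possible workaround (an untested sketch that assumes the units string ends in a single fractional-seconds suffix) is to strip the fraction from the units and add it back afterwards as a timedelta:

import datetime
import netCDF4

units = 'seconds since 2013-05-15T00:00:34.653020'
base, frac = units.split('.')                       # 'seconds since ...T00:00:34', '653020'
offset = datetime.timedelta(microseconds=int(frac.ljust(6, '0')[:6]))
print(netCDF4.num2date(0.3, base) + offset)

2013-05-15 00:00:34.953020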

jswhit avatar Dec 27 '18 17:12 jswhit