
Proleptic Gregorian Calendar and ISO-8601 support

Open urbanjost opened this issue 5 years ago • 13 comments

I am wondering what the interest is in civilian calendar functions (starting with proleptic Gregorian calendar and ISO-8601 support, but open to other calendar systems). I think that a basic implementation should be relatively quick to produce and close-ended.

An example of the scope of functionality I am initially envisioning as being part of a standard interface is encompassed in several sources

Current technology:

  • The C/C++ date and time interfaces (should the interface essentially be a binding to the C interface, leveraging ISO_C_BINDING?) (C)
  • The Fortran Wiki has links to several popular interfaces
  • M_time(FORTRAN:PD)
  • datetime-fortran (FORTRAN)
  • libdate from the FLIBS repository (FORTRAN)
  • High-precision Fortran libraries such as NASA's SPICElib are a good example if you care about leap seconds, orbital mechanics, astronomy, and GPS/satellite communications. But I personally consider this level of precision above this initiative (at this point, at least). There are some very interesting ways to flexibly handle the precision up to fourteen places after the decimal point. See routines like the following that talk about leap second engines and so on (unfortunately the page is alphabetical, not sorted by category):
  • GR2JUL - Gregorian to Julian Calendar
  • UTC2ET - UTC to Ephemeris Time
  • TPARSE - Parse a UTC time string
  • TPICTR - Create a Time Format Picture
  • TIMOUT - Time Output
  • TCHCKD - Time components are checked
  • TCHECK - Time Check
  • TEXPYR - Time --- Expand year
  • TIMDEF - Time Software Defaults
  • UNITIM - Uniform time scale transformation
  • TTRANS - Time transformation
  • TPARCH - Parse check---check format of strings
  • TPARTV - Time string ---parse to a time vector
  • TSETYR - Time --- set year expansion boundaries
  • ETCAL - Convert ET to Calendar format
  • SPKLEF - S/P Kernel, Load ephemeris file
  • SPKUEF - S/P Kernel, Unload ephemeris file (FORTRAN)

urbanjost avatar Jan 10 '20 01:01 urbanjost

Thanks for starting this issue. I believe we should have date and time handling in stdlib. Similar to #103 and #104, when we have most of date and time functionality in stdlib (no matter what API), I will happily sunset the datetime-fortran project and direct users to stdlib. Until then...

From a cursory inspection of M_time and flibs/libdate, and from my knowledge of datetime-fortran, my impression is:

  • All three provide a datetime (or similarly named) class, with integer components for year, month, and so on, and comparison operators;
  • flibs/libdate seems the most basic of the three in terms of functionality, but is nevertheless a useful reference;
  • M_time allows adding and subtracting seconds from a datetime to return a new datetime;
  • M_time also provides some interesting astronomical functions related to phases of the moon and similar that I didn't see in the other two;
  • datetime-fortran provides a timedelta class which allows adding or subtracting arbitrary periods of time: You can do datetime +/- timedelta (returns a datetime), timedelta +/- timedelta (returns a timedelta), or datetime - datetime (returns a timedelta); datetime and timedelta are closely modeled after Python's counterparts;
  • datetime-fortran also provides interfaces to C tm struct and strftime and strptime functions, which I didn't see in other libraries.
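The timedelta arithmetic described in the list can be sketched with operator overloading. This is only a minimal illustration, not datetime-fortran's actual implementation: the module name, the single `milliseconds` component, and `total_seconds` are assumptions made for the example, and the datetime +/- timedelta cases are left out because they need calendar arithmetic.

```fortran
module timedelta_sketch
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  private
  public :: timedelta, operator(+), operator(-), total_seconds

  ! Hypothetical type: a signed duration stored as whole milliseconds.
  type :: timedelta
    integer(int64) :: milliseconds = 0_int64
  end type timedelta

  interface operator(+)
    module procedure td_add
  end interface
  interface operator(-)
    module procedure td_sub
  end interface

contains

  ! timedelta + timedelta returns a timedelta
  pure function td_add(a, b) result(c)
    type(timedelta), intent(in) :: a, b
    type(timedelta) :: c
    c%milliseconds = a%milliseconds + b%milliseconds
  end function td_add

  ! timedelta - timedelta returns a timedelta
  pure function td_sub(a, b) result(c)
    type(timedelta), intent(in) :: a, b
    type(timedelta) :: c
    c%milliseconds = a%milliseconds - b%milliseconds
  end function td_sub

  ! Convenience accessor: duration as real64 seconds
  pure function total_seconds(td) result(s)
    type(timedelta), intent(in) :: td
    real(real64) :: s
    s = real(td%milliseconds, real64) / 1000.0_real64
  end function total_seconds

end module timedelta_sketch

program timedelta_demo
  use, intrinsic :: iso_fortran_env, only: int64
  use timedelta_sketch
  implicit none
  type(timedelta) :: one_hour, half_hour

  one_hour  = timedelta(3600000_int64)
  half_hour = timedelta(1800000_int64)

  print '(f0.1)', total_seconds(one_hour + half_hour)   ! 5400.0
  print '(f0.1)', total_seconds(one_hour - half_hour)   ! 1800.0
end program timedelta_demo
```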

Being the central piece, it seems to me like the first natural step for us to discuss is the datetime derived type. From datetime-fortran:

type :: datetime
  integer :: year = 1
  integer :: month = 1
  integer :: day = 1
  integer :: hour = 0
  integer :: minute = 0
  integer :: second = 0
  integer :: millisecond = 0
end type datetime

Using initializers for components allows one to instantiate with just datetime(). Perhaps that's not so useful for year, month, and day, but for hour, minute, second, and millisecond it is because it allows you to easily work with just dates: datetime(2020, 1, 12) will represent midnight of today. This is similar to Python's datetime.datetime.
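As a small illustration of the point about default initializers, the sketch below builds one value with no arguments and one with only a date, then prints both in an ISO-8601 style. The `print_iso` helper and its format string are assumptions made for this example; they are not part of datetime-fortran.

```fortran
program datetime_defaults
  implicit none

  type :: datetime
    integer :: year = 1
    integer :: month = 1
    integer :: day = 1
    integer :: hour = 0
    integer :: minute = 0
    integer :: second = 0
    integer :: millisecond = 0
  end type datetime

  type(datetime) :: first, d

  first = datetime()            ! every component takes its default
  d = datetime(2020, 1, 12)     ! time-of-day components default to midnight

  call print_iso(first)         ! 0001-01-01T00:00:00.000
  call print_iso(d)             ! 2020-01-12T00:00:00.000

contains

  ! Hypothetical helper: zero-padded ISO-8601 style output
  subroutine print_iso(t)
    type(datetime), intent(in) :: t
    print '(i4.4,"-",i2.2,"-",i2.2,"T",i2.2,":",i2.2,":",i2.2,".",i3.3)', &
      t%year, t%month, t%day, t%hour, t%minute, t%second, t%millisecond
  end subroutine print_iso

end program datetime_defaults
```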

@certik @marshallward @zbeekman @jvdp1 @ivan-pi what do you think?

milancurcic avatar Jan 12 '20 22:01 milancurcic

FYI: there are applications that require time computations to be more accurate than 1 millisecond.

jacobwilliams avatar Jan 12 '20 23:01 jacobwilliams

Of course. For getting the current time, pure Fortran will get us down to a millisecond. For microseconds we may need to interface with C (Python's datetime gets microseconds).
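For reference, this is roughly what the pure Fortran route looks like: the intrinsic DATE_AND_TIME fills an eight-element integer array whose last element is milliseconds. The program name and output layout are just for illustration; whether the millisecond value is meaningful depends on the system clock.

```fortran
program now_milliseconds
  implicit none
  integer :: v(8)

  call date_and_time(values=v)

  ! v(1:3) = year, month, day
  ! v(4)   = offset from UTC in minutes
  ! v(5:8) = hour, minute, second, millisecond
  print '(i4.4,"-",i2.2,"-",i2.2,"T",i2.2,":",i2.2,":",i2.2,".",i3.3)', &
    v(1:3), v(5:8)
  print '("UTC offset (minutes): ",i0)', v(4)
end program now_milliseconds
```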

For arithmetic, we can go as fine as we want. Will microseconds suffice? Can you list applications that you know of that would need this or more precise timekeeping?

Perhaps we can treat this more generally by declaring datetime % second as a real of a high precision.

milancurcic avatar Jan 12 '20 23:01 milancurcic

I actually have some applications of my own that need to go way below a millisecond. In the one example module (M_time) most of the computations are done in double precision, but if you go through the DAT array you do round to milliseconds, primarily because the DATE_AND_TIME routine only returns milliseconds. A large amount of standard calendar use does not go below a second for formatted output. There were a couple of reasons to basically extend the DATE_AND_TIME function in M_time, including the limitations it had; some were probably similar to why DATE_AND_TIME has no unit smaller than a millisecond.

A good point. What precision do we want a datetime structure to be able to hold?


urbanjost avatar Jan 13 '20 02:01 urbanjost

For a lot of orbit computations we will use ephemeris time (a count of seconds since 1/1/2000). As a double precision number, that gives around 1 microsecond precision for the present day. I am aware of precise orbit determination and deep space navigation applications that require more accuracy than that. I think some people use femtoseconds (1e-15 sec).
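The resolution of a double-precision seconds count can be checked with the SPACING intrinsic, which returns the gap between adjacent representable values near its argument. This is a rough sketch: the 20-year offset and the 365.25-day year are assumptions chosen to represent "the present day" relative to a 2000 epoch. On IEEE doubles the result is about 1.2e-7 s, so sub-microsecond resolution is the limit for a single real64 count.

```fortran
program et_resolution
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  real(real64), parameter :: seconds_per_year = 365.25_real64 * 86400.0_real64
  real(real64) :: et

  ! Assumed sample value: roughly 20 years after the epoch, in seconds
  et = 20.0_real64 * seconds_per_year

  ! Gap between et and the next representable real64 value
  print '(a,es10.3,a)', 'spacing near present-day ET: ', spacing(et), ' s'
end program et_resolution
```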

I'm thinking more for calculations. The precision of getting the system time doesn't really matter too much to me personally.

jacobwilliams avatar Jan 13 '20 03:01 jacobwilliams

I added a link to an alphabetical index of the SPICElib library, because it has routines with a settable precision down to 14 digits after the decimal, allows for correcting for the accursed leap seconds with a "leap-second kernel", and uses a # character in the output formats to let the user specify how many digits to display. I am still looking for a link to SPICElib documentation on just the time functions, which I thought existed but have not found. If we look at going to higher precision, there are lessons to be learned there.

urbanjost avatar Jan 13 '20 08:01 urbanjost

Combining the questions about precision that have come up, and looking at the high-precision SPICElib documentation, I think the type should have an element that gives the precision of the "millisecond" element, in the sense that it is the number of digits that are useful after the decimal place in a floating point representation of the "milliseconds".

Assuming int32 values that would let the new precision element be anything from 0 to 9. To allow the precision value to be up to 18 everything would probably be type INT64. Note that the JULIA language uses INT64 values, but has three fields for partial fractions named millisecond, microsecond, and nanosecond.
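The 9 and 18 digit limits can be read directly from the RANGE intrinsic, which reports the decimal exponent range of an integer kind. A minimal check (the program name is arbitrary):

```fortran
program digit_capacity
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none

  ! range(x) is the largest r such that every integer with r decimal digits fits
  print '(a,i0)', 'int32 decimal range: ', range(1_int32)   ! 9
  print '(a,i0)', 'int64 decimal range: ', range(1_int64)   ! 18

  print '(a,i0)', 'huge(1_int32) = ', huge(1_int32)
  print '(a,i0)', 'huge(1_int64) = ', huge(1_int64)
end program digit_capacity
```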

Since DATE_AND_TIME is the standard routine for getting the system clock time and is silent about whether the milliseconds returned are valid, I think it would be useful to ask for an enhancement to DATE_AND_TIME: either a ninth field designating precision in the VALUES array returned, or a new optional argument called PRECISION. I see that a number of implementations, like Python's, carry warnings about milliseconds being potentially inaccurate, stating that some system clocks return values only to the second; there is no way to detect this condition from DATE_AND_TIME itself.

So assuming milliseconds are being returned correctly the precision element would be 3 for DATE_AND_TIME, for example; but if not other precision values would at least let you know.

Maybe even a value of -1 could mean not to trust even the seconds field, and so on. I do not think that is required without a use case, but it might be useful: if you converted a date like "Jan. 1st 2020", you really did not know the hour, minute, or second, for example.

Are we assuming values have to be "valid" times and so negative values are not allowed? Actually in some stuff I wrote if you had a date representing "Jan 11th 2020 at noon" and then subtracted 100 from the hour field and then queried it as a civil calendar string it would return 100 hours earlier, so I did not "check for valid values" like making sure hours were positive between 0 and 24 for example.
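The "subtract 100 from the hour field" behaviour can be sketched by folding the out-of-range hours into a day shift and applying it through a Julian day number. This is only one possible approach, not how M_time does it. The day-number conversions follow the well-known Fliegel and Van Flandern integer formulas for the Gregorian calendar (valid for positive Julian day numbers); the routine names are made up for the example.

```fortran
program normalize_hours
  implicit none
  integer :: y, m, d, h, jdn, day_shift

  y = 2020; m = 1; d = 11; h = 12        ! 11 Jan 2020 at noon
  h = h - 100                            ! deliberately out of range: -88

  day_shift = (h - modulo(h, 24)) / 24   ! floor division by 24 (here -4)
  h = modulo(h, 24)                      ! hour folded back into 0..23 (here 8)

  jdn = julian_day(y, m, d) + day_shift
  call gregorian(jdn, y, m, d)

  print '(i4.4,"-",i2.2,"-",i2.2,"T",i2.2,":00")', y, m, d, h   ! 2020-01-07T08:00

contains

  ! Gregorian calendar date -> Julian day number (relies on truncating division)
  pure integer function julian_day(i, j, k) result(jd)
    integer, intent(in) :: i, j, k   ! year, month, day
    jd = k - 32075 + 1461*(i + 4800 + (j - 14)/12)/4 &
         + 367*(j - 2 - (j - 14)/12*12)/12 &
         - 3*((i + 4900 + (j - 14)/12)/100)/4
  end function julian_day

  ! Julian day number -> Gregorian calendar date
  pure subroutine gregorian(jd, i, j, k)
    integer, intent(in)  :: jd
    integer, intent(out) :: i, j, k  ! year, month, day
    integer :: l, n
    l = jd + 68569
    n = 4*l/146097
    l = l - (146097*n + 3)/4
    i = 4000*(l + 1)/1461001
    l = l - 1461*i/4 + 31
    j = 80*l/2447
    k = l - 2447*j/80
    l = j/11
    j = j + 2 - 12*l
    i = 100*(n - 49) + i + l
  end subroutine gregorian

end program normalize_hours
```

Allowing "illegal" field values and normalizing on demand keeps arithmetic simple, at the cost of not catching input mistakes early.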

So, a PRECISION element or not? Or leave it at milliseconds, add other fields like JULIA, or go with a floating point value?

Are values allowed to be "illegal", like negative numbers?

If they cannot be "illegal" should there be a way to flag the value as "unknown"; maybe a logical value for a KNOWN attribute for each field?

If PRECISION is allowed, what should the required range of the value be? If it is to be greater than the number of digits that huge(1_int32) can hold, then the integer has to be of a larger kind than usual, such as kind=int64.

And are timezones, daylight saving time, and leap seconds not included in the type? For scientific computation these are usually ignored, and civil calendars sometimes ignore leap seconds in computations. It can get complicated, but I think it is important to select a model, or the type described is ambiguous. I would assume the type described so far is a Zulu time because there is no time zone, but I am not sure.

urbanjost avatar Jan 13 '20 17:01 urbanjost

I wonder if this might be a use case for parameterized derived types (I confess I've never found a use case before)?:

type :: datetime(sec_precision)
  integer, len :: sec_precision = 3  ! defaults to milliseconds
  integer :: year = 1
  integer :: month = 1
  integer :: day = 1
  integer :: hour = 0
  integer :: minute = 0
  integer :: second = 0
  ! No default initializer here: a component whose bound depends on a
  ! length parameter cannot carry one, so set it after declaration.
  integer, dimension(sec_precision) :: fractions_of_sec
end type datetime

jacobwilliams avatar Jan 13 '20 20:01 jacobwilliams

My use case

For geospace plasma physics simulations, data assimilation, and remote sensing, I feel that integer microseconds are the minimum necessary precision in a datetime type. More precise is fine too.

A typical case is a simulation evolving on microsecond timescales, assimilating data from satellites, radars, GNSS, etc. with sensor cadences from 100 microseconds to 1 second.

Suggestion

  • int64 microseconds
  • all timekeeping parameters int64
  • convenience methods input/output real64 time e.g. for seconds and fractional second
  • some internal calculation would use real64

Question

  • What is the compromise smallest datetime "tick" to settle on? Windows uses an int64 100-nanosecond tick.
  • Is the epoch window with int64 nanoseconds too small for our Fortran audience?
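To put numbers on the epoch-window question, the sketch below converts huge(1_int64) ticks into years for three candidate tick sizes. The 365.2425-day year is an assumption for the conversion. With nanosecond ticks the window is roughly 292 years either side of the epoch.

```fortran
program tick_ranges
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  real(real64), parameter :: sec_per_year = 365.2425_real64 * 86400.0_real64
  real(real64) :: max_ticks

  max_ticks = real(huge(1_int64), real64)   ! largest representable tick count

  ! Years covered on each side of the epoch for a given tick size
  print '(a,f0.1,a)', '1 ns ticks:   +/- ', max_ticks * 1.0e-9_real64 / sec_per_year, ' years'
  print '(a,f0.1,a)', '100 ns ticks: +/- ', max_ticks * 1.0e-7_real64 / sec_per_year, ' years'
  print '(a,f0.1,a)', '1 us ticks:   +/- ', max_ticks * 1.0e-6_real64 / sec_per_year, ' years'
end program tick_ranges
```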

scivision avatar Jan 14 '20 17:01 scivision

Another FYI: the IAU SOFA library has some routines for time computations. http://www.iausofa.org

jacobwilliams avatar Jan 20 '20 21:01 jacobwilliams

After looking through a good number of other languages, it seems that int64 is commonly used in most recent implementations. Precision is handled in a variety of ways. Julia includes additional fields called millisecond, microsecond, and nanosecond instead of a precision value. I find that confusing and a bit vague, and prefer the idea of a single integer for fractional seconds with a precision field saying how many digits of accuracy it holds.

In several cases precision is lost when converting from single high-precision values (Julian dates, Unix epoch time, and a variety of others) to civilian times. Scientific calculations seem to almost always use UTC/GMT/Zulu time and avoid the issues with timezones and daylight saving time, but often include corrections for leap seconds.

I'm torn on whether we should separate the two main classes: high-precision computations and timing versus general civilian calendar dates and times, which in practice seem to rarely even use fractional seconds. But especially if we try to have one type that covers it all, I agree integer values should be int64 and floats should be at least real64. I find it unlikely any relevant system would not have real64, so that seems reasonable. But most scientific calculations I found just used intrinsic types for precise computations, so I am still wondering whether it is reasonable to make a single type for all time-related functions.

I have also not seen much discussion of whether the type should include timezone information. In business applications in particular it seems important to know that the data was generated in a particular time zone, whereas scientific calculations rarely appear to care about timezones when doing computations, and the timezones appear to just be a potential cause of errors.

Still, I would say we go with the proposed type but specify it to be int64, add a numeric-only timezone field similar to the one in the Fortran intrinsic DATE_AND_TIME(3), and add another field called PRECISION that indicates the number of significant digits in the "millisecond" field, which would be renamed to something like "fractional seconds". Promising that much precision basically implies we consider leap seconds, which I was hoping to at least initially ignore.
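One way to read this proposal is sketched below. The component names, their order, and the `seconds_with_fraction` helper are assumptions made for illustration; the timezone is taken to be minutes from UTC, as in the fourth element of the DATE_AND_TIME values array.

```fortran
module datetime_proposal_sketch
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none

  type :: datetime
    integer(int64) :: year = 1
    integer(int64) :: month = 1
    integer(int64) :: day = 1
    integer(int64) :: hour = 0
    integer(int64) :: minute = 0
    integer(int64) :: second = 0
    integer(int64) :: fractional_second = 0  ! count of 10**(-precision) seconds
    integer(int64) :: precision = 3          ! significant digits in fractional_second
    integer(int64) :: timezone = 0           ! minutes from UTC (assumed convention)
  end type datetime

contains

  ! Seconds within the minute, including the fractional part, as real64
  pure function seconds_with_fraction(t) result(s)
    type(datetime), intent(in) :: t
    real(real64) :: s
    s = real(t%second, real64) &
        + real(t%fractional_second, real64) / 10.0_real64**t%precision
  end function seconds_with_fraction

end module datetime_proposal_sketch

program datetime_proposal_demo
  use datetime_proposal_sketch
  implicit none
  integer :: v(8)
  integer(int64) :: w(8)
  type(datetime) :: t

  call date_and_time(values=v)
  w = v   ! widen to int64

  ! DATE_AND_TIME supplies milliseconds, so precision is 3 here
  t = datetime(w(1), w(2), w(3), w(5), w(6), w(7), &
               fractional_second=w(8), precision=3_int64, timezone=w(4))

  print '(f0.3)', seconds_with_fraction(t)
end program datetime_proposal_demo
```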

urbanjost avatar Jan 23 '20 06:01 urbanjost

type :: datetime
  integer :: year = 1
  integer :: month = 1
  integer :: day = 1
  integer :: hour = 0
  integer :: minute = 0
  integer :: second = 0
  integer :: millisecond = 0
end type datetime

Often I want to work with just dates. Should a date type with just year, month, day components also be part of stdlib?
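One possible shape for this, sketched under the assumption that type extension is acceptable: a date type holding only year, month, and day, with datetime extending it. Nothing here is an agreed stdlib design; the module name and layout are hypothetical.

```fortran
module date_sketch
  implicit none

  ! Date-only type
  type :: date
    integer :: year = 1
    integer :: month = 1
    integer :: day = 1
  end type date

  ! datetime inherits year, month, day and adds time of day
  type, extends(date) :: datetime
    integer :: hour = 0
    integer :: minute = 0
    integer :: second = 0
    integer :: millisecond = 0
  end type datetime

end module date_sketch

program date_demo
  use date_sketch
  implicit none
  type(date) :: d
  type(datetime) :: t

  d = date(2021, 10, 19)
  t = datetime(date=d, hour=19)   ! build from a date via the parent component

  print '(i4.4,"-",i2.2,"-",i2.2,"T",i2.2)', t%year, t%month, t%day, t%hour

  d = t%date                      ! drop the time-of-day part again
  print '(i4.4,"-",i2.2,"-",i2.2)', d%year, d%month, d%day
end program date_demo
```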

Beliavsky avatar Oct 19 '21 19:10 Beliavsky

The QSAS Time Format Conversion Library, which is part of the PLplot enhancement libraries, uses a combination of an integer and a double to represent times with an accuracy of 0.01 ns.
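The quoted accuracy is plausible if the integer counts whole days and the double counts seconds within the day. That layout is an assumption here, not checked against the QSAS source. Under that assumption, the spacing of a real64 just below 86400 s is about 1.5e-11 s, i.e. on the order of 0.01 ns:

```fortran
program split_time_resolution
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  ! Hypothetical layout: whole days plus seconds within the day
  type :: split_time
    integer :: day = 0
    real(real64) :: seconds = 0.0_real64   ! 0 <= seconds < 86400
  end type split_time

  type(split_time) :: t

  t = split_time(58992, 86399.0_real64)    ! arbitrary day count, last second of the day

  ! Worst-case resolution occurs near the end of the day
  print '(a,es10.3,a)', 'resolution near end of day: ', spacing(t%seconds), ' s'
end program split_time_resolution
```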

ivan-pi avatar May 23 '22 10:05 ivan-pi