http-extensions

Date format

Open mnot opened this issue 3 years ago • 16 comments

Some have pushed back on using integers for dates; see e.g. https://github.com/ietf-wg-httpapi/deprecation-header/issues/11

As I suggested there, one accommodation would be to define a new structured type for them. That could be done in retrofit, or separately.

mnot avatar Jun 15 '22 23:06 mnot

I see no mention of fractional seconds ?

I think we need to ponder that, if the goal is (eventual) convergence for all timestamps in HTTP ?

Considering how much effort we spend on speeding up HTTP, I find the "human readable" argument utterly bogus.

Only a very tiny fraction of these timestamps are ever read by humans, and most are in a context where software trivially can render the number in 8601 format if so desired.

In terms of efficiency, I will concede that, in an HTTP context, it is almost always possible to perform the necessary calculations and comparisons on raw ISO-8601 timestamps without resorting to the full calendrical conversions, but once all the necessary paranoia is included, I doubt it is an optimization.

My preference is sf-decimal seconds since epoch, (and this is largely why sf-decimal has three decimals in the first place), because it gives us fast processing, good compression and millisecond resolution.

PS: A Twitter poll with only 40 respondents, carried out on the first Monday after New Year's ? Really ?!

bsdphk avatar Jun 16 '22 05:06 bsdphk
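To make the sf-decimal proposal above concrete, here is a minimal sketch of writing and reading epoch seconds with exactly three decimals, and of rendering the value in 8601 form on demand. The function names are made up for illustration, and the sketch assumes non-negative, UTC timestamps; it is not taken from any draft.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_sf_decimal(ts: datetime) -> str:
    """Serialize an aware datetime as epoch seconds with three decimals."""
    # Integer arithmetic on timedeltas avoids float rounding surprises.
    millis = (ts - EPOCH) // timedelta(milliseconds=1)
    if millis < 0:
        raise ValueError("this sketch only handles times at or after the epoch")
    return f"{millis // 1000}.{millis % 1000:03d}"


def from_sf_decimal(text: str) -> datetime:
    """Parse epoch seconds with up to three decimals into a UTC datetime."""
    millis = int(Decimal(text) * 1000)
    return EPOCH + timedelta(milliseconds=millis)


def to_iso8601(ts: datetime) -> str:
    """Render a UTC datetime for human consumption."""
    return ts.isoformat(timespec="milliseconds").replace("+00:00", "Z")


stamp = datetime(2022, 6, 16, 5, 6, 0, 123000, tzinfo=timezone.utc)
wire = to_sf_decimal(stamp)
print(wire, "->", to_iso8601(from_sf_decimal(wire)))
```

Comparisons and differences can then be done on the numeric value alone; the 8601 rendering is only needed where a human is actually looking.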

One argument against any "seconds since epoch" representation is that (at least under the prevailing POSIX semantics) it is incapable of representing an instant within a positive leap second (since "each and every day shall be accounted for by exactly 86400 seconds", any such time would be interpreted as being in the first second of the following UTC day).

gibson042 avatar Jun 16 '22 15:06 gibson042
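A small sketch of the point above, assuming the usual POSIX "86400 seconds per day" arithmetic (the helper name is hypothetical). It shows that the leap second at the end of 2016 and the first second of 2017 collapse onto the same value.

```python
from datetime import date

SECONDS_PER_DAY = 86400
_EPOCH_ORDINAL = date(1970, 1, 1).toordinal()


def posix_seconds(year, month, day, hour, minute, second):
    """Seconds since the epoch, counting every day as exactly 86400 seconds."""
    days = date(year, month, day).toordinal() - _EPOCH_ORDINAL
    return days * SECONDS_PER_DAY + hour * 3600 + minute * 60 + second


leap_second = posix_seconds(2016, 12, 31, 23, 59, 60)  # a real leap second
next_day = posix_seconds(2017, 1, 1, 0, 0, 0)
print(leap_second, next_day, leap_second == next_day)
```

Both instants map to the same integer, so a recipient of the number cannot tell them apart.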

For HTTP I think subsecond precision is overkill due to RTT / intermediaries.

ioggstream avatar Jun 16 '22 15:06 ioggstream

IMHO:

I like structured format. There are a lot of diagnostics and other software that would not want to do additional transcoding of header elements but just represent them verbatim as they occur. Those are helped if the verbatim representation is already easier to read.

I do not think TZ belongs in a time/date representation. Aka: all date/times in UTC please. TZ can be added somehow as "location" information, ideally in a different header field. If it is in the same header field, it would be confusing to show the time in UTC but ALSO indicate the TZ. Maybe this can be resolved by coming up with a representation that makes it as obvious as possible that the date/time is NOT adjusted for the TZ shown. Maybe call it "LOC" rather than "TZ", so that the "the date/time is TZ adjusted" recognition is not triggered. And of course waste the three characters on "UTC" in the representation.

I do not believe the processing speed argument to be valid. The level of processing speed where just a (sub)second unstructured value would help goes IMHO into the space of CoAP, not HTTP. I'd first try to make HTTP better where it does not just duplicate CoAP efforts. All that binary stuff CoAP needs to do makes toolchains considerably more complex IMHO.

I do like the leap-second argument.

I do think msec accuracy should be an option. RTTs within LAN/metro environments can easily be in the single-digit msec range, and you want to be able to diagnose e.g. REST-based broker/bus-type transaction speeds, relative ordering of request/replies.

toerless avatar Jun 16 '22 17:06 toerless

> The level of processing speed where just a (sub)second unstructured value would help goes IMHO into the space of CoAP, not HTTP

Seems reasonable to me.

> Aka: all date/times in UTC please

+1

> relative ordering of request/replies

Not sure this is an "interface" that HTTP exposes / should be relied upon.

ioggstream avatar Jun 17 '22 08:06 ioggstream

> where just a (sub)second unstructured value would help goes IMHO into the space of CoAP, not HTTP.

I suspect the experience of pretty much everybody writing and deploying HTTP intermediaries and servers at scale is different -- at those rates, cycles wasted on parsing do matter, especially when architectures have many layers of intermediaries to go through.

mnot avatar Jun 19 '22 09:06 mnot

See linked PR for a proposal.

mnot avatar Jun 21 '22 01:06 mnot

Re leapseconds:

#2170 as written ("excluding leap-seconds") means that ...00:00:60Z timestamps will be illegal, effectively mandating POSIX leap-second (non-)handling.

Making #2170 able to handle leap-seconds would require every HTTP-handling entity to have an up-to-date leapseconds file, either at system level or application level, and the upper limit on the "internal data model range" would become indeterminable, since we cannot predict how many leap seconds will happen in the future, nor what sign they may have.

That is a total no-go from a systems engineering and complexity point of view, in particular as the rest of the world converges on either papering over leap-seconds or abolishing them.

If the textual format cannot handle leap-seconds (without requiring a boat-load of complexity), then, why take the extra expense in CPU-load and code complexity ?

As stated earlier, we should go for the more efficient and less error-prone solution: POSIX time_t (with optional milliseconds) as a sf-decimal.

(If we want to designate the timestamps with a sentinel, '@' is a particularly bad choice, since the HPACK Huffman code assigns it 13 bits.)

PS: In context of four digit years: Century scale predictions, based on orbital considerations, expect the current leap-second regime to break down in approx 2000 years, because we will need leap seconds more often than every month.

bsdphk avatar Jun 21 '22 07:06 bsdphk

Do any current HTTP implementations take leap seconds into account?

Regarding HPACK - it's one character. Let's not over-optimise here.

mnot avatar Jun 22 '22 04:06 mnot
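For a sense of scale on the HPACK point, here is a rough sketch that adds up Huffman code lengths for a sentinel-prefixed integer. The lengths for '@' and the digits are copied from the static table in RFC 7541 Appendix B; the table below is deliberately partial and the helper names are hypothetical.

```python
import math

# Code lengths in bits, from the HPACK static Huffman table (RFC 7541, Appendix B).
# Only the symbols needed for "@" followed by digits are listed here.
HUFFMAN_BITS = {
    "@": 13,
    "0": 5, "1": 5, "2": 5,
    "3": 6, "4": 6, "5": 6, "6": 6, "7": 6, "8": 6, "9": 6,
}


def huffman_bits(text: str) -> int:
    """Total Huffman-coded length of text in bits (before padding)."""
    return sum(HUFFMAN_BITS[ch] for ch in text)


def huffman_octets(text: str) -> int:
    """Octets on the wire once the string is padded to an octet boundary."""
    return math.ceil(huffman_bits(text) / 8)


for value in ("1659073897", "@1659073897"):
    print(value, huffman_bits(value), "bits,", huffman_octets(value), "octets")
```

For this particular value the sentinel costs one extra octet after padding, which is the trade-off being argued over above.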

> Do any current HTTP implementations take leap seconds into account?

I don't know, but leap seconds are explicitly supported by HTTP-date:

  time-of-day  = hour ":" minute ":" second
               ; 00:00:00 - 23:59:60 (leap second)

And the difference between 23:59:60 and following-day 00:00:00 would certainly be relevant for e.g. Last-Modified.

gibson042 avatar Jun 29 '22 00:06 gibson042
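As an illustration of the grammar quoted above, a recipient that wants to enforce the ranges given in the ABNF comment might check time-of-day roughly like this. The regular expression and function name are assumptions for this sketch; note that it accepts a seconds value of 60 at any minute, which is looser than where leap seconds actually occur.

```python
import re

# hour 00-23, minute 00-59, second 00-60 (60 only makes sense for a leap second)
TIME_OF_DAY = re.compile(r"([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]|60)")


def is_valid_time_of_day(text: str) -> bool:
    """Return True if text fits the 00:00:00 - 23:59:60 range."""
    return TIME_OF_DAY.fullmatch(text) is not None


for candidate in ("00:00:00", "23:59:60", "23:59:61", "24:00:00"):
    print(candidate, is_valid_time_of_day(candidate))
```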

I have spent a lot of time over the years searching, but I have yet to see a 23:59:60 timestamp in the wild...

bsdphk avatar Jul 04 '22 08:07 bsdphk

Now there's a picture in my mind of PHK walking the wilds, searching for an elusive date stamp.

mnot avatar Jul 04 '22 08:07 mnot

I don't know what counts as "in the wild", but there is certainly documentation of people observing leap seconds and software failing upon encountering them.

gibson042 avatar Jul 05 '22 14:07 gibson042

Discussion at IETF114 wasn't conclusive, but we now seem to be focusing on these options:

  1. Do nothing. Dates in SF are Integers (unless folks fall back to Strings), and recipients need to know that they're dates and how to handle them.
  2. Create a Date SF type along the lines of the PR. The textual representation is human-friendly, and it's identified as a date in a machine-readable way without special knowledge.
  3. Create a Date SF type but use an Integer; e.g., Date: @1659073897. The textual representation is not human-friendly, but it is identified as a date in a machine-readable way without special knowledge.

I'll observe that if we believe that we'll eventually have a binary representation of structured fields widely in use, (2) is preferable in that it has the best properties -- it's both efficient and human-friendly. Of course, that's not at all certain.

I'll also observe that in the discussions I've had, the people who want human-friendly representation are generally those who are working with tools and developers, "closer" to the application. The infrastructure folks and "lower layer" folks tend to minimise this aspect.

mnot avatar Jul 29 '22 05:07 mnot
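A rough sketch of what handling option (3) could look like on the receiving side, using the example value from the list above. The syntax is assumed from that example only ('@' followed by an integer); the function names are hypothetical and nothing here is taken from the PR text.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_sf_date(item: str) -> datetime:
    """Parse an '@'-prefixed integer (seconds since the epoch) into a UTC datetime."""
    if not item.startswith("@"):
        raise ValueError("not a Date item: missing '@' sentinel")
    digits = item[1:]
    unsigned = digits[1:] if digits.startswith("-") else digits
    if not (unsigned.isascii() and unsigned.isdigit()):
        raise ValueError("Date item must be '@' followed by an integer")
    return EPOCH + timedelta(seconds=int(digits))


def render_for_humans(ts: datetime) -> str:
    """Render the parsed value in the human-friendly form discussed in option (2)."""
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")


parsed = parse_sf_date("@1659073897")
print(render_for_humans(parsed))
```

The sentinel lets generic tooling recognise the value as a date and do this rendering without knowing anything about the specific field.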

PR updated to reflect (3) above.

mnot avatar Aug 04 '22 02:08 mnot

I personally support (2) Date: @2022-02-02T12:12:12Z or Date: @2022-02-02 because it covers similar use cases to HTTP-date.

In case of (3) I think we can just use an Integer SF, like we do in Signatures and other specs.

Side note: we sometimes hear that RFC3339 is ambiguous. I think it is, and after ~20 years it probably requires an RFC3339bis that clearly specifies a basic profile (T, Z uppercase; always Z). I'd be happy to start this work.

ioggstream avatar Aug 09 '22 14:08 ioggstream
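To illustrate what such a basic profile might accept, here is a minimal sketch: uppercase T and Z only, always Z, whole seconds. Excluding fractional seconds and the 23:59:60 form is an assumption of this sketch, not something stated above, and the names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Strict shape: YYYY-MM-DDTHH:MM:SSZ with uppercase 'T' and 'Z', no offsets.
BASIC_PROFILE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})Z"
)


def parse_basic_profile(text: str) -> datetime:
    """Parse a timestamp in the strict profile, or raise ValueError."""
    match = BASIC_PROFILE.fullmatch(text)
    if match is None:
        raise ValueError(f"not in the basic profile: {text!r}")
    # datetime() range-checks the fields (and rejects a seconds value of 60).
    return datetime(*(int(group) for group in match.groups()), tzinfo=timezone.utc)


print(parse_basic_profile("2022-02-02T12:12:12Z"))
```

Inputs such as `2022-02-02t12:12:12z` or `2022-02-02T12:12:12+00:00`, which general RFC3339 parsers may accept, are rejected here.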

> I see no mention of fractional seconds ?

> My preference is sf-decimal seconds since epoch, (and this is largely why sf-decimal has three decimals in the first place), because it gives us fast processing, good compression and millisecond resolution.

Why does the number of decimals need to be specified? Surely we could just say the format is nnnnn[.ddddddd] or similar, where an integer-only parser stops at whitespace or a '.', while those who care about sub-second parsing continue until they are satisfied with the resolution or they run out of digits. The sender decides how many decimals to provide, perhaps with a recommendation of three, or up to their clock resolution?

So "123456789" and "123456789.1" and "123456789.123456789" are all equally valid.

markdingo avatar Aug 22 '22 03:08 markdingo
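A minimal sketch of the reader side of that idea: the recipient chooses how many fractional digits it cares about and ignores the rest. Truncating (rather than rounding) the extra digits is an assumption of this sketch, as is the function name.

```python
import re
from typing import Tuple

_TIMESTAMP = re.compile(r"([0-9]+)(?:\.([0-9]+))?")


def parse_timestamp(text: str, digits: int = 3) -> Tuple[int, int]:
    """Return (whole seconds, fraction scaled to `digits` decimal places)."""
    match = _TIMESTAMP.fullmatch(text)
    if match is None:
        raise ValueError(f"not a timestamp: {text!r}")
    seconds = int(match.group(1))
    # Keep only the digits we care about; pad short fractions with zeros.
    fraction = (match.group(2) or "")[:digits].ljust(digits, "0")
    return seconds, int(fraction) if fraction else 0


for value in ("123456789", "123456789.1", "123456789.123456789"):
    print(value, parse_timestamp(value), parse_timestamp(value, digits=0))
```

With `digits=0` this behaves like the integer-only parser described above; with `digits=3` all three example values yield a millisecond count.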