Clarify Timestamp documentation

Open PeytonT opened this issue 3 years ago • 2 comments

Some parts of the Timestamp documentation could be reworded for clarity, to reduce the amount of cross-referencing needed to fully understand the specified behavior.

A pair of topics for consideration:

Timestamp component requirements

From the binary documentation:

If a timestamp representation has a component of a certain precision, each of the less precise components must also be present or else the representation is illegal. For example, a timestamp representation that has a fraction_exponent and fraction_coefficient component but not the month component, is illegal.


This seems like it should be moved out of the binary documentation and into the spec documentation, or removed as redundant with the linked W3C note.

A binary decoder interprets the bytes of a Timestamp in sequence: first as a VarInt offset, then as a VarUInt year, then as a VarUInt month, and so on. A Timestamp that has an offset, year, fraction_exponent, and fraction_coefficient but no month is therefore unrepresentable. The supposed fraction_exponent and fraction_coefficient will be treated as though they are the VarUInt month, leading to a decoding error if the supposed VarInt fraction_exponent and Int fraction_coefficient don't happen to form a valid VarUInt.
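To illustrate the positional decoding, here is a minimal Python sketch. It assumes the VarUInt/VarInt layout from the binary documentation (7 payload bits per octet, high bit set on the final octet, sign in bit 6 of the first VarInt octet). The function names are hypothetical, it only reads the body bytes after the type descriptor, and it stops at the day component; it is not taken from any Ion library.

```python
def read_var_uint(data, pos):
    """Read a VarUInt: 7 payload bits per octet, high bit marks the last octet."""
    value = 0
    while True:
        octet = data[pos]
        pos += 1
        value = (value << 7) | (octet & 0x7F)
        if octet & 0x80:
            return value, pos


def read_var_int(data, pos):
    """Read a VarInt; returns (negative, magnitude, pos) so negative zero survives."""
    octet = data[pos]
    pos += 1
    negative = bool(octet & 0x40)   # bit 6 of the first octet is the sign
    magnitude = octet & 0x3F        # first octet carries only 6 magnitude bits
    while not (octet & 0x80):
        octet = data[pos]
        pos += 1
        magnitude = (magnitude << 7) | (octet & 0x7F)
    return negative, magnitude, pos


def decode_timestamp_prefix(body):
    """Decode offset, year, then month and day if present, strictly in order."""
    negative, magnitude, pos = read_var_int(body, 0)
    fields = {"offset": (negative, magnitude)}
    for name in ("year", "month", "day"):
        if pos >= len(body):
            break
        fields[name], pos = read_var_uint(body, pos)
    return fields


# offset 0, year 2021, month 7
ok = decode_timestamp_prefix(bytes([0x80, 0x0F, 0xE5, 0x87]))
# offset 0, year 2021, then a byte that was "meant" as VarInt fraction_exponent -3
bad = decode_timestamp_prefix(bytes([0x80, 0x0F, 0xE5, 0xC3]))
```

In the second call the decoder has no way to know that `0xC3` was intended as a fraction_exponent: it is simply read as the VarUInt in the month position, which yields a value outside the valid month range.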

Local Offsets

The text documentation leads off with:

Timestamps represent a specific moment in time, always include a local offset, and are capable of arbitrary precision.

This agrees with the binary documentation:

The 2 non-optional components are offset and year.

But conflicts with the user guide:

Ion timestamps may optionally encode a time zone offset.

And conflicts with itself a few lines down:

In the text format... Local-time offsets... are required on timestamps with time and are not allowed on date values.

But is then further detailed with:

Ion follows the “Unknown Local Offset Convention” of RFC3339

Values that are precise only to the year, month, or date are assumed to be UTC values with unknown local offset.


Synthesizing these points and taking into consideration the semantic isomorphism between the text and binary formats, I reach the following readings:

  • The spec counts having an unknown offset as "including a local offset" when it specifies that Timestamps "always include a local offset".

  • When the user guide says that Timestamps "optionally encode a time zone offset" it means that Timestamps optionally encode a known offset, and otherwise encode the default of "unknown".

  • Points already noted in https://github.com/amzn/ion-docs/issues/91. These can be inferred, but would benefit from being made explicit.

    • Text Timestamps with "date values" are prohibited from having a known offset, so they must always encode the default of unknown local offset. By isomorphism, this restriction also applies to binary Timestamps.
    • Binary Timestamps encode "unknown local offset" with an offset VarInt encoding negative zero (likely 0b11000000, but a less compact form of negative zero would also be valid).
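The negative-zero reading can be checked with a small Python sketch of a VarInt encoder. It assumes the VarInt layout from the binary documentation (end flag in bit 7 of the last octet, sign in bit 6 of the first octet); `encode_var_int` is a hypothetical helper and always emits the most compact form.

```python
def encode_var_int(magnitude, negative=False):
    """Encode a sign and magnitude as a VarInt (sketch, most compact form)."""
    # Split the magnitude into 7-bit groups, most significant group first.
    groups = []
    while True:
        groups.append(magnitude & 0x7F)
        magnitude >>= 7
        if magnitude == 0:
            break
    groups.reverse()
    # The first octet has only 6 magnitude bits, because bit 6 is the sign.
    if groups[0] & 0x40:
        groups.insert(0, 0)
    if negative:
        groups[0] |= 0x40
    groups[-1] |= 0x80  # end flag on the final octet
    return bytes(groups)


unknown_offset = encode_var_int(0, negative=True)   # negative zero
utc_offset = encode_var_int(0)                      # known offset of +00:00
minus_eight_hours = encode_var_int(480, negative=True)  # -08:00 in minutes
```

Under these assumptions, negative zero comes out as the single octet `0b11000000`, distinct from the `0b10000000` that encodes a known offset of zero.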

These elements of the spec could be made more explicit, perhaps with text like:

  • https://amzn.github.io/ion-docs/docs/spec.html#timestamp

    • Timestamps represent a specific moment in time, are capable of arbitrary precision, and always include either a known or an unknown local offset.

    • Known local offsets are required on timestamps with time and are not allowed on date values.

    • Values that do not have a known local offset are by default UTC values with unknown local offset.

  • https://amzn.github.io/ion-docs/docs/binary.html#6-timestamp

    • The offset denotes the local-offset portion of the timestamp, in minutes difference from UTC. An unknown local offset is represented as negative zero.

PeytonT, Jul 03 '21 00:07

(Note: my usage of the term unambiguous local offset below is not defined in the spec; I'm introducing it for the purposes of this comment to mean that we know whether a local offset is known or unknown. If we need to clarify the spec I'd be open to suggestions for a better term.)

Timestamps represent a specific moment in time, always include a local offset, and are capable of arbitrary precision.

I agree this is potentially confusing since text timestamps without a time component do not explicitly include a local offset in the encoding. However, even in this case, the local offset of such text timestamps is unambiguous: it's unknown local offset (which would be encoded explicitly as -00:00 for text timestamps with a time component). Perhaps this sentence should instead state that the local offset of any timestamp is always unambiguous, even for date-only timestamps where it is not explicitly included in the encoding. I believe this is consistent with your first conclusion: "The spec counts having an unknown offset as 'including a local offset' when it specifies that Timestamps 'always include a local offset'."

I also agree that the "Why Ion?" page's sentence "Ion timestamps may optionally encode a time zone offset" isn't very crisp. "Optional" doesn't seem to be the right word, as both the text and binary encodings either require or prohibit encoding an explicit local offset depending on whether the timestamp has precision beyond date (modulo the issue described in #91). Maybe something more general like "Ion timestamps are capable of conveying time zone offsets" would cover us here.

tgregg, Jul 07 '21 00:07

Here are a few more things about timestamps that probably ought to be clarified in the docs. I have worded this in a Q&A format, but it does not necessarily need to be so.

Timestamps represent a specific moment in time, always include a local offset, and are capable of arbitrary precision.

What is a "moment in time" in this context? The definition of moment found on Google (which is provided by Oxford Languages) has both "a very brief period of time" and "an exact point in time". "Moment in time" should be interpreted in the latter sense of "an exact point in time".

What exactly is meant by precision? Does precision imply the existence of error or uncertainty? Precision refers to the length of the timestamp representation (or the number of significant figures), rather than the units used to measure time (e.g. hours, minutes, seconds). Though precision is like significant figures, it does not imply the existence of error or uncertainty. The Ion value itself has no error or uncertainty because it always faithfully represents user data.

How does precision affect the equality of timestamps? Ion equality is different from (e.g.) mathematical equality. For example, -0 and 0 are mathematically equivalent, but are not Ion equivalent. With respect to timestamps, two Ion timestamps with the same time but different precisions (e.g. 2021-01-01T00:00Z and 2021-01-01T00:00:00.000000Z) are equal points in time, even though they are not equal Ion values.
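The two notions of equality can be modelled with a short Python sketch. `SketchTimestamp`, `same_instant`, and `ion_equivalent` are hypothetical names for illustration, not the API of any Ion library, and the choice to include the local offset in the stricter comparison is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SketchTimestamp:
    instant: datetime                  # the exact point in time
    precision: str                     # e.g. "minute" or "second"
    fraction_digits: int = 0           # digits of fractional seconds
    offset_minutes: Optional[int] = 0  # None models an unknown local offset


def same_instant(a, b):
    """Compare only the point in time, ignoring precision and offset."""
    return a.instant == b.instant


def ion_equivalent(a, b):
    """Stricter comparison: instant, precision, and offset must all match."""
    return a == b


midnight = datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc)
# Models 2021-01-01T00:00Z
to_minute = SketchTimestamp(midnight, "minute")
# Models 2021-01-01T00:00:00.000000Z
to_microsecond = SketchTimestamp(midnight, "second", fraction_digits=6)
```

Here `same_instant(to_minute, to_microsecond)` holds while `ion_equivalent(to_minute, to_microsecond)` does not, which mirrors the distinction drawn in the answer above.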

Unlike a number, which counts from some “epoch”, arbitrary precision timestamps also allow applications to represent deliberate ambiguity.

Ion timestamps do not have any inherent ambiguity. We should remove the mention of "deliberate ambiguity" or else elaborate on it to avoid any confusion.

popematt, Jul 09 '21 20:07