
Well-defined timezone handling

Open dchiba opened this issue 5 years ago • 0 comments

This thread is a spin-off from the requirements gathering (issue #3) about the way timezone is handled in formatting a date/time value.

Original description:

The way a timezone adjustment is made in date/time formatting should be clearly specified. The default timezone conversion behavior should be reasonable and unambiguous. The message author should be able to optionally specify a desired timezone conversion. This is meant to make it easier for applications to support timezones correctly.

@grhoten commented:

It's my preference that the time zone handling should be a part of the calendar object being formatted and not in the message format.

The date/calendar object being formatted could carry timezone information, in which case the formatter can simply print the value in that timezone. However, in many cases the original value is normalized in UTC or otherwise missing timezone information, so the formatter must figure out the timezone to print the value in.

Typically, a calendar object uses a field based internal data structure which contains a timezone field, while a date object uses a simple incremental millisecond count since the Unix time epoch in UTC. The latter type is often chosen as the only native datatype for date/time in modern programming environments as it is easier to evaluate and meets a great majority of the needs of the applications.

Requiring the application to convert a date to a calendar just for presentation in the desired timezone is undesirable. In fact, modern programming environments already support implicit conversion for basic application scenarios such as printing the current date/time or a timestamp in the user's local time. Applications don't have to specify the presentation timezone, because a 'default timezone' is automatically applied. For instance, in a browser, the default timezone is the local timezone of the machine it is running on, and new Date().toString() prints the current date and time in that timezone.
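
As a rough Python counterpart to the browser example, the sketch below takes the millisecond count used in the examples later in this issue and renders it once in UTC and once in whatever timezone the machine happens to be configured with. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
epoch_ms = 1580336155304  # millisecond count since the Unix epoch

# The count identifies an instant; it carries no timezone of its own.
instant = EPOCH + timedelta(milliseconds=epoch_ms)

# astimezone() with no argument applies the system's local timezone,
# which is the implicit "default timezone" described above.
local_view = instant.astimezone()

print(instant.isoformat())     # same on every machine
print(local_view.isoformat())  # varies with the machine's timezone setting
```

Both values denote the same instant; only the printed fields differ, which is why the choice of presentation timezone needs to be specified somewhere.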

The desired default timezone is often the local timezone of the end user, but there are many exceptions; for instance, on a mobile device, the desired default may not be the timezone of the user's physical location. Instead, it may be that of the user's "home" location.

Here is a summary of different patterns to apply a timezone in formatting a date/time value:

  1. The value is timezone independent so no conversion is applied. e.g. DOB
  2. The "default timezone" as defined by the environment should be used.
  3. The user's preferred/home timezone set in the user profile should be used. e.g. appointments in personal calendar
  4. A business entity has an associated timezone that should be used. e.g. flight schedule, stock price quotes
  5. A timezone may be explicitly specified.

The syntax should use reasonable defaults and support options that enable the message author to specify how timezone should be applied.
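
To make the five patterns concrete, here is a minimal Python sketch of how a formatter might pick the presentation timezone. The names TzPattern and resolve_zone and their keyword arguments are hypothetical and not part of any proposal; the sketch also assumes IANA timezone data is available to zoneinfo.

```python
from enum import Enum
from typing import Optional
from zoneinfo import ZoneInfo


class TzPattern(Enum):
    """Hypothetical names for the five patterns listed above."""
    INDEPENDENT = 1  # no conversion, e.g. a date of birth
    PLATFORM = 2     # whatever the environment treats as its default
    USER = 3         # the user's preferred/home timezone
    ENTITY = 4       # a timezone associated with a business entity
    EXPLICIT = 5     # a timezone named directly in the message


def resolve_zone(
    pattern: TzPattern,
    *,
    platform_zone: Optional[str] = None,
    user_zone: Optional[str] = None,
    entity_zone: Optional[str] = None,
    explicit_zone: Optional[str] = None,
) -> Optional[ZoneInfo]:
    """Return the presentation timezone, or None when no conversion applies."""
    if pattern is TzPattern.INDEPENDENT:
        return None
    candidates = {
        TzPattern.PLATFORM: platform_zone,
        TzPattern.USER: user_zone,
        TzPattern.ENTITY: entity_zone,
        TzPattern.EXPLICIT: explicit_zone,
    }
    name = candidates[pattern]
    if name is None:
        # Failing loudly keeps the behavior predictable.
        raise ValueError(f"no timezone supplied for {pattern.name}")
    return ZoneInfo(name)
```

Raising when the required timezone is missing is one possible design choice; falling back to a documented default would be another.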

=== Examples for Each Pattern ===

Pattern 1, timezone independent scenario: Most date values that represent a specific day are timezone independent, so date values should be printed in a timezone-neutral way by default. If a time portion is supplied, it should be ignored.

To print a date of birth, the message template may look similar to the following (The markup is for illustration of the concept only. 'dob' is the parameter name, 'date' is the type.):

DOB: {dob, date}

Let's consider a sample output 'DOB: January 29, 2020' in US locale. The argument set to dob may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 date string : "2020-01-29" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-29T12:34:56Z" - 'T' and everything after it are ignored
  • A linear millisecond count since the Unix time epoch : 1580336155304 - evaluated in UTC and the time portion is dropped
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,...} Only the significant fields (year, month and day) are used, the others are ignored.
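
The handling of the argument forms above could be sketched in Python as follows. The function names to_plain_date and format_dob are made up for this illustration, and a plain dict stands in for a Calendar object.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Union

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_plain_date(value: Union[str, int, dict]) -> date:
    """Reduce the argument to year/month/day without any timezone conversion."""
    if isinstance(value, str):
        # "2020-01-29" or "2020-01-29T12:34:56Z": 'T' and the rest are ignored.
        return date.fromisoformat(value.split("T", 1)[0])
    if isinstance(value, int):
        # Millisecond count: evaluated in UTC, time portion dropped.
        return (EPOCH + timedelta(milliseconds=value)).date()
    if isinstance(value, dict):
        # Calendar-like fields: only year, month and day are significant.
        return date(value["year"], value["month"], value["day"])
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


def format_dob(value: Union[str, int, dict]) -> str:
    d = to_plain_date(value)
    # %B yields the English month name under Python's default "C" locale.
    return f"DOB: {d:%B} {d.day}, {d.year}"


for arg in ("2020-01-29", "2020-01-29T12:34:56Z", 1580336155304,
            {"year": 2020, "month": 1, "day": 29}):
    print(format_dob(arg))  # DOB: January 29, 2020 (every time)
```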

Pattern 2, default timezone scenario: This pattern is for compatibility with the platform defaults so the behavior would vary depending on the platform's implementation. I feel this pattern does not have to be supported as long as the other patterns are supported. I am still including this as a pattern because it helps understand the differences from the other patterns and why using a platform default could cause problems.

In particular, because of the dependency on the platform's default implementation, the application behavior may be unpredictable or require a specific configuration to make the behavior predictable.

To print a date of birth using the platform's default timezone, the message template may look similar to the following (The markup is for illustration of the concept only. 'dob' is the parameter name, 'date' is the type, 'platform' is for selecting this option.):

DOB: {dob, date, platform}

Let's consider a sample output 'DOB: January 29, 2020' in US locale. The argument set to dob may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 date string : "2020-01-29" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-29T12:34:56Z" - the UTC based moment is converted to the default timezone and the date fields get printed. This may be one day ahead of or behind the date in UTC, depending on the UTC offset of the default timezone.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the default timezone and the date fields get printed. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,...} Only the significant fields (year, month and day) are used, the others are ignored. The platform default designation is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
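
The unpredictability can be illustrated with a short Python sketch in which the platform default is passed in explicitly, so that two differently configured machines can be compared side by side. The function name and the two zones are chosen only for this illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dob_in_default_zone(epoch_ms: int, default_zone: str) -> str:
    """Format the date fields after converting to the given default timezone."""
    instant = EPOCH + timedelta(milliseconds=epoch_ms)
    local = instant.astimezone(ZoneInfo(default_zone))
    return f"DOB: {local:%B} {local.day}, {local.year}"


# The count below is the instant 2020-01-29T22:15:55.304Z.
print(dob_in_default_zone(1580336155304, "America/Los_Angeles"))  # DOB: January 29, 2020
print(dob_in_default_zone(1580336155304, "Asia/Tokyo"))           # DOB: January 30, 2020
```

The same stored value yields two different birth dates, which is the problem Pattern 1 avoids.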

Pattern 3, user's preferred/home timezone: A timestamp value that represents a specific moment is typically converted to a presentation timezone, which is normally the timezone of the user's home location. So timestamp values should be printed with timezone conversion applied by default. If the user's timezone is not set, a default may be used.

To print when an object is last updated, the message template may look similar to the following (The markup is for illustration of the concept only. 'instant' is the parameter name, 'datetime' is the type, 'user' selects this option.):

Last updated: {instant, datetime, user}

Let's consider a sample output 'Last updated: January 29, 2020 05:20:30 PM' in US locale where the user's preferred timezone is US Pacific time (8 hours behind UTC). The argument set to instant may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 datetime string without a UTC offset : "2020-01-29T17:20:30" - no timezone conversion
  • ISO 8601 datetime string : "2020-01-30T01:20:30Z" - the UTC based moment is converted to the user timezone 'America/Los_Angeles' and all fields get printed. The sample output is one day behind the date in the UTC based timestamp because of the UTC offset of the user timezone (-8 hours).
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the user timezone and all fields get printed. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:17,minute:20,second:30,tz:"America/Los_Angeles",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. The user timezone designation is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
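
For the two string forms above, the conversion could look like the following Python sketch. The function name format_last_updated is hypothetical, and the user's timezone is passed in directly rather than read from a user profile.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def format_last_updated(value: str, user_zone: str) -> str:
    # fromisoformat() accepts a trailing "Z" only from Python 3.11, so normalize it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is not None:
        # The value identifies a moment: convert it to the user's timezone.
        parsed = parsed.astimezone(ZoneInfo(user_zone))
    # A value without an offset is printed as-is, with no conversion.
    return (f"Last updated: {parsed:%B} {parsed.day}, {parsed.year} "
            f"{parsed:%I:%M:%S %p}")


print(format_last_updated("2020-01-30T01:20:30Z", "America/Los_Angeles"))
# Last updated: January 29, 2020 05:20:30 PM
print(format_last_updated("2020-01-29T17:20:30", "America/Los_Angeles"))
# Last updated: January 29, 2020 05:20:30 PM
```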

Pattern 4, business entity timezone: A timestamp value that represents a specific moment may be converted for presentation to a timezone associated with a business entity.

When printing the time of intraday stock prices, the stock exchange is the business entity and the timezone of its location would be the relevant business entity timezone in which the timestamps should be presented. In this case, the message template may look similar to the following (The markup is for illustration of the concept only. 'time' is the parameter name, 'datetime' is the type, 'tz' may be the name of a function that supplies the timezone of the stock market, 'price' is where the stock price would go.):

{time, datetime, tz} {price}

Let's consider a sample output 'January 29, 2020 12:00:00 PM $234.56' in US locale in the timezone of the stock market. The argument set to time may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 datetime string without a UTC offset : "2020-01-29T12:00:00" - no timezone conversion; this case is rare, and an exception may be thrown for the mismatch.
  • ISO 8601 datetime string : "2020-01-29T17:00:00Z" - the UTC based moment is converted to the timezone supplied by the tz function, e.g. 'America/New_York'. The 5-hour offset between UTC and the US Eastern timezone is applied for the adjustment, so the sample output is 5 hours behind the UTC based timestamp.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the timezone resolved by the tz parameter. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:12,minute:00,second:00,tz:"America/New_York",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. The timezone designation in the message is also ignored, or an exception may be thrown because the expression and the argument type contradict each other.
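
A minimal Python sketch of the tz function idea follows. The lookup table EXCHANGE_ZONES, the exchange codes and the function names are all assumptions made for this illustration; a real application would resolve the entity's timezone from its own data.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical lookup table mapping a business entity to its timezone.
EXCHANGE_ZONES = {"NYSE": "America/New_York", "TSE": "Asia/Tokyo"}


def tz(exchange: str) -> ZoneInfo:
    """Supply the timezone associated with the given stock exchange."""
    return ZoneInfo(EXCHANGE_ZONES[exchange])


def format_quote(time_utc: str, price: str, exchange: str) -> str:
    moment = datetime.fromisoformat(time_utc.replace("Z", "+00:00"))
    if moment.tzinfo is None:
        # A value without an offset cannot be converted reliably.
        raise ValueError("timestamp must carry a UTC offset")
    local = moment.astimezone(tz(exchange))
    return f"{local:%B} {local.day}, {local.year} {local:%I:%M:%S %p} {price}"


print(format_quote("2020-01-29T17:00:00Z", "$234.56", "NYSE"))
# January 29, 2020 12:00:00 PM $234.56
```

The message stays the same for every exchange; only the value returned by tz changes.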

Pattern 5, Hardcoding scenarios: A specific timezone may be explicitly set on the message when it is appropriate to do so. This may be the case when the timezone for presentation is known when composing the message.

To print intraday stock prices in an application for NYSE, it may be fine to hardcode the US Eastern timezone. Then the message template may look similar to the following (The markup is for illustration of the concept only. 'time' is the parameter name, 'datetime' is the type, 'America/New_York' is the presentation timezone, 'price' is where the stock price would go.):

{time, datetime, 'America/New_York'} {price}

Let's consider a sample output 'January 29, 2020 12:00:00 PM $234.56' in US locale in US Eastern time. The argument set to time may take several different forms. Here is how the argument value(s) may look:

  • ISO 8601 datetime string without a UTC offset : "2020-01-29T12:00:00" - no timezone conversion; this case is rare, and an exception may be thrown for the mismatch.
  • ISO 8601 datetime string : "2020-01-29T17:00:00Z" - the UTC based moment is converted to the specified timezone. The sample output is 5 hours behind the UTC based timestamp.
  • A linear millisecond count since the Unix time epoch : 1580336155304 - the UTC based count is converted to the timezone specified in the message. Effectively the same processing as the previous case.
  • A native Date object : Same as the millisec count as long as the date type is implemented using the Unix epoch based count (this is the case in JavaScript, Java and many others)
  • A Calendar object : {year:2020,month:1,day:29,hour:12,minute:00,second:00,tz:"America/New_York",...} Only the significant fields (year, month, day, hour, minute, second and tz) are used, the others are ignored. In this case, it may be appropriate to convert from the timezone of the calendar to the specified timezone, but this is debatable. Alternatively, an exception may be thrown because the expression and the argument type contradict each other.
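
The two options for the Calendar case can be sketched in Python as below. The constant MESSAGE_ZONE, the function format_calendar and its strict flag are hypothetical, a dict stands in for a Calendar object, and the Los Angeles value is an added example rather than one from this issue.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

MESSAGE_ZONE = "America/New_York"  # the timezone hardcoded in the message


def format_calendar(cal: dict, strict: bool = False) -> str:
    if strict and cal["tz"] != MESSAGE_ZONE:
        # Option 1: treat the contradiction as an error.
        raise ValueError(
            f"calendar timezone {cal['tz']} contradicts {MESSAGE_ZONE}")
    # Option 2: convert from the calendar's timezone to the message's.
    source = datetime(cal["year"], cal["month"], cal["day"],
                      cal["hour"], cal["minute"], cal["second"],
                      tzinfo=ZoneInfo(cal["tz"]))
    local = source.astimezone(ZoneInfo(MESSAGE_ZONE))
    return f"{local:%B} {local.day}, {local.year} {local:%I:%M:%S %p}"


pacific = {"year": 2020, "month": 1, "day": 29,
           "hour": 9, "minute": 0, "second": 0, "tz": "America/Los_Angeles"}
print(format_calendar(pacific))  # January 29, 2020 12:00:00 PM
```

Calling format_calendar(pacific, strict=True) raises ValueError instead, which corresponds to the exception option described above.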

=== Summary ===

Well-defined timezone handling enables the message author to control the presentation timezone and achieve the intended application behavior, by making the timezone conversion behavior predictable. The set of arguments the application needs to supply at runtime is clearly defined, and a strict coding pattern is enforced to reduce the chance of encountering an unexpected result. Correct coding patterns to achieve desired application behaviors are promoted through syntax checking of the message, linting of the application and good documentation.

This is the opposite of what typically happens today: currently, the way application code is written largely defines the presentation behavior, so it is hard for the message author to tell whether the message will be printed in the intended timezone. It is the application developer who is responsible for writing the correct code that performs the intended timezone conversion. With well-defined timezone handling, application developers would no longer play the primary role, because the message itself describes how timezone conversion is to happen.

dchiba, Jan 30 '20 17:01