Presenting times in non-UTC timezones
tl;dr
Issues like https://github.com/brimdata/zui/issues/1057 and https://github.com/brimdata/brimcap/issues/352 reflect users' desire to sometimes work in non-UTC timezones, such as their local time. The strftime function added in #5197 provides the %z and %Z formatting directives that could allow output of such timezones, but right now they only reflect the UTC timezone.
To improve on this, we've discussed approaches for a user to provide a hint for an alternate timezone such that the rendered time value and the output of the %z / %Z directives would reflect that timezone's offset from UTC. In preliminary discussions on this topic, we've reached some consensus on a proposal to add a third, optional parameter to our strftime that would provide that offset/timezone hint.
Details
Comparison With Other Tools
Users are likely to compare our timezone support to that of other tools and/or copy string-based time values between Zed tools and other tools, so it may be helpful to know how others approach this problem. Since they both support the same formatting options as our strftime, I happened to start with jq and GNU Date.
Since all these tools offer their own default/alternate formats, as a neutral starting point, I'll start from a UTC seconds-since-epoch value 1723662771 that roughly matches the time when this issue was written. I'm ignoring the fractional seconds for now because Zed has orthogonal catch-up to do there (#5220).
jq
jq has its own strftime function that supports formatting options like Zed's function of the same name. Per their docs, it is only intended to format a time in UTC (well, they call it GMT).
$ echo '1723662771' | jq 'strftime("%Y-%m-%dT%H:%M:%S")' -
"2024-08-14T19:12:51"
For output in non-UTC timezones, jq offers a separate strflocaltime function. By default it outputs time in the local timezone for the system on which jq is running, though it overrides that behavior if the TZ environment variable is set to something else. My MacBook happens to be in Pacific Daylight Time at the moment, so:
$ echo '1723662771' | jq 'strflocaltime("%Y-%m-%dT%H:%M:%S")' -
"2024-08-14T12:12:51"
$ echo '1723662771' | TZ=US/Eastern jq 'strflocaltime("%Y-%m-%dT%H:%M:%S")' -
"2024-08-14T15:12:51"
Adding the %z or %Z directives can show the offset or timezone abbreviation, respectively.
$ echo '1723662771' | jq 'strflocaltime("%Y-%m-%dT%H:%M:%S%z")' -
"2024-08-14T12:12:51-0800"
$ echo '1723662771' | jq 'strflocaltime("%Y-%m-%dT%H:%M:%S %Z")' -
"2024-08-14T12:12:51 PST"
A close inspection of that output reveals a bug, which I see has already been filed as https://github.com/jqlang/jq/issues/1912: the output should have said -0700 and PDT, not -0800 and PST! Let's not have that bug in Zed.
Going back to the regular strftime and its mission to be for UTC only: it indeed doesn't change the time value itself based on the local system time or the TZ variable (i.e., it's still in UTC), but it does still reflect a non-UTC timezone in the %z or %Z directives if invoked, which seems weird and like something else we'd not want to mimic.
$ echo '1723662771' | jq 'strftime("%Y-%m-%dT%H:%M:%S%z")' -
"2024-08-14T19:12:51-0800"
$ echo '1723662771' | jq 'strftime("%Y-%m-%dT%H:%M:%S %Z")' -
"2024-08-14T19:12:51 PST"
GNU Date
GNU Date effectively behaves like jq's strflocaltime by default (i.e., it reflects the local system timezone, or what's in TZ if present... though it does the right thing with daylight saving time!), with the formatting directives invoked via +.
$ gdate --date=@1723662771 +"%Y-%m-%dT%H:%M:%S%z"
2024-08-14T12:12:51-0700
$ TZ=US/Eastern gdate --date=@1723662771 +"%Y-%m-%dT%H:%M:%S%z"
2024-08-14T15:12:51-0400
$ gdate --date=@1723662771 +"%Y-%m-%dT%H:%M:%S %Z"
2024-08-14T12:12:51 PDT
$ TZ=US/Eastern gdate --date=@1723662771 +"%Y-%m-%dT%H:%M:%S %Z"
2024-08-14T15:12:51 EDT
If -u is added, GNU Date behaves more like jq's strftime and renders in UTC, though unlike the weird jq behavior cited above, it appears strict in always reflecting UTC, even in the timezone offset/abbreviation output by %z or %Z.
$ gdate -u --date=@1723662771 +"%Y-%m-%dT%H:%M:%S%z"
2024-08-14T19:12:51+0000
$ TZ=US/Eastern gdate -u --date=@1723662771 +"%Y-%m-%dT%H:%M:%S%z"
2024-08-14T19:12:51+0000
$ gdate -u --date=@1723662771 +"%Y-%m-%dT%H:%M:%S %Z"
2024-08-14T19:12:51 UTC
$ TZ=US/Eastern gdate -u --date=@1723662771 +"%Y-%m-%dT%H:%M:%S %Z"
2024-08-14T19:12:51 UTC
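Since Zed is written in Go, it may be useful to see that Go's standard time package produces the same DST-aware results as GNU Date for this epoch value. What follows is a minimal sketch and not Zed code: the formatIn helper is a hypothetical name, America/Los_Angeles and America/New_York are used as the IANA names for Pacific and Eastern time, and the time/tzdata import is only there so the example does not depend on the host's zoneinfo files.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so zone names resolve on any host
)

// layout is Go's reference-time equivalent of "%Y-%m-%dT%H:%M:%S%z %Z".
const layout = "2006-01-02T15:04:05-0700 MST"

// formatIn (hypothetical helper) renders a seconds-since-epoch value
// in the named IANA zone, including the offset and abbreviation.
func formatIn(sec int64, zone string) (string, error) {
	loc, err := time.LoadLocation(zone)
	if err != nil {
		return "", err
	}
	return time.Unix(sec, 0).In(loc).Format(layout), nil
}

func main() {
	for _, zone := range []string{"UTC", "America/Los_Angeles", "America/New_York"} {
		s, err := formatIn(1723662771, zone)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```

Running it prints the UTC, Pacific, and Eastern renderings with +0000 UTC, -0700 PDT, and -0400 EDT respectively, in agreement with the gdate outputs above.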
Zed
Repro is with Zed commit 71e35c5.
$ zq -version
Version: v1.17.0-20-g71e35c5d
While Zed's strftime supports %z and %Z for completeness, at the moment they only reflect the UTC timezone. No attempt is made to reflect the local system time or check the TZ environment variable for a hint, so this all seems to match GNU Date's -u behavior.
$ echo '1723662771' | zq -z 'yield time(this * 1000000000) | strftime("%Y-%m-%dT%H:%M:%S%z", this)' -
"2024-08-14T19:12:51+0000"
$ echo '1723662771' | TZ=US/Eastern zq -z 'yield time(this * 1000000000) | strftime("%Y-%m-%dT%H:%M:%S%z", this)' -
"2024-08-14T19:12:51+0000"
$ echo '1723662771' | zq -z 'yield time(this * 1000000000) | strftime("%Y-%m-%dT%H:%M:%S %Z", this)' -
"2024-08-14T19:12:51 UTC"
$ echo '1723662771' | TZ=US/Eastern zq -z 'yield time(this * 1000000000) | strftime("%Y-%m-%dT%H:%M:%S %Z", this)' -
"2024-08-14T19:12:51 UTC"
However, if provided a string representation of a time value that has a non-UTC offset or timezone abbreviation, Zed's cast to its time type converts it to the appropriate UTC value. So, starting from the Pacific Time outputs from GNU Date shown previously, we get the same UTC outputs for both of these.
$ echo '"2024-08-14T12:12:51-0700"' | zq -z 'yield time(this)' -
2024-08-14T19:12:51Z
$ echo '"2024-08-14T12:12:51 PDT"' | zq -z 'yield time(this)' -
2024-08-14T19:12:51Z
The same is true if there's a colon in the timezone offset.
$ echo '"2024-08-14T12:12:51-07:00"' | zq -z 'yield time(this)' -
2024-08-14T19:12:51Z
I point this out because I found that even as a non-string time literal, the language is already prepared to interpret timezone offsets correctly, but only if they include a colon.
$ echo '2024-08-14T12:12:51-07:00' | zq -z 'yield this' -
2024-08-14T19:12:51Z
$ echo '2024-08-14T12:12:51-0700' | zq -z 'yield this' -
stdio:stdin: format detection error
arrows: schema message length exceeds 1 MiB
csv: line 1: delimiter ',' not found
json: strconv.ParseFloat: parsing "2024-08-14": invalid syntax
line: auto-detection not supported
parquet: auto-detection requires seekable input
tsv: line 1: delimiter '\t' not found
vng: auto-detection requires seekable input
zeek: line 1: bad types/fields definition in zeek header
zjson: line 1: malformed ZJSON: bad type object: "2024-08-14T12:12:51-0700": unpacker error parsing JSON: invalid character '-' after top-level value
zng: unknown ZNG message frame type: 3
zson: ZSON syntax error
I peeked at the code and I see how this all fits together. The Zed casting code for time (i.e., what would be used to convert string-based timestamps) ultimately depends on https://github.com/araddon/dateparse, which is very flexible in which formats it accepts, hence offsets with and without colons are both fine. Meanwhile the ZSON parser (i.e., what would parse ZSON time literals) depends on the RFC3339Nano layout of Go's time.Parse, and RFC 3339 only supports offsets with colons.
The ZSON spec describes time as "an RFC 3339 UTC date/time string". Since the parser is ready to accept them with an offset as long as there's a colon, perhaps we could add a flag at some point to enable printing the ZSON time values with a specified offset. If we did that, it would probably also make sense to add a formatting directive to Zed's strftime function to include the colons, since https://github.com/lestrrat-go/strftime doesn't currently offer one (#5253). FWIW, other tools like GNU Date or the JavaScript library https://github.com/samsonjs/strftime provide offset-with-colon via the directive %:z.
Zed Proposals
In a preliminary discussion on this topic, @mattnibs made the following proposal:
I could imagine strftime taking a third duration argument that specifies the offset of timezone and then you could use the %z directive to display the timezone. We could have a timezone function that would return the offset. E.g.
strftime("%z", now(), tz("PST"))
@nwt had a preference for a string argument instead of a duration + function, such that the user could directly input an offset or supported timezone name/abbreviation.
Implementation details aside, these ideas do seem like they'd provide the base functionality that's currently missing.
That said, this requires the user to lock their timezone within the Zed program, which seems less convenient than how jq and GNU Date provide ways to automatically reflect the system's local time or an alternate timezone specified in TZ. Environments may want this if they have users spread across multiple timezones running the same Zed programs who all want to see times presented in local format.
If we started from one of the proposals above, perhaps a way of calling the proposed tz function or a particular value for the proposed string argument could invoke the behavior seen with jq and GNU Date where it obeys the local system time and overrides that if TZ is set.
A possible alternative to the "third strftime parameter" approach might be to offer some kind of CLI option that allows the specification of an alternate timezone/offset (e.g., zq -tz="US/Eastern") and have that setting affect strftime and any other time-centric functionality we may add in the future (e.g., if we wanted to start allowing the printing of time literals with offsets rather than just strings via strftime). One side effect of this approach is that it could provide a way for users to get the benefit of their TZ environment variable without the Zed tooling having to explicitly know about it, e.g., if they invoke with zq -tz=$TZ.
Zui Proposals
Addressing this data presentation topic in Zui is covered separately in https://github.com/brimdata/zui/issues/1057.