Generic Partial Location

Open sffc opened this issue 1 year ago • 22 comments

Generic Partial Location is supposed to be used when the offsets for a zone are ambiguous.

There isn't a clear algorithm in the spec on how to determine this, but I assume one would be roughly:

  1. Compute the offset from the TZDB for the golden zone of the metazone in the current region
  2. If the given offset differs from the computed offset, use Generic Partial Location
  3. Otherwise, if the offsets match, use Generic Non-Location

For example: It is July 1 in America/Phoenix. The metazone is America_Mountain. The golden zone is America/Denver. The offset for America/Denver on July 1 is UTC-6. However, America/Phoenix has an offset of UTC-7. Therefore, use Generic Partial Location.
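The three steps above could be sketched roughly as follows. This is only an illustration: `GenericFormat` and `choose_generic_format` are hypothetical names, not ICU4X API, and the golden zone's offset is assumed to have already been looked up in the TZDB by the caller.

```rust
/// Which generic format to print (illustrative names, not ICU4X API).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum GenericFormat {
    /// e.g. "Mountain Time"
    NonLocation,
    /// e.g. "Mountain Time (Phoenix)"
    PartialLocation,
}

/// Hypothetical helper.
/// `zone_offset_secs` is the UTC offset given for the input zone.
/// `golden_offset_secs` is the offset computed from the TZDB for the
/// metazone's golden zone at the same instant (assumed to be supplied).
fn choose_generic_format(zone_offset_secs: i32, golden_offset_secs: i32) -> GenericFormat {
    if zone_offset_secs != golden_offset_secs {
        // Step 2: the offsets differ, so the metazone name alone is ambiguous.
        GenericFormat::PartialLocation
    } else {
        // Step 3: the offsets match.
        GenericFormat::NonLocation
    }
}
```

With the July 1 example, `choose_generic_format(-7 * 3600, -6 * 3600)` selects the partial location format for America/Phoenix.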

However, we can't do this type of calculation at formatting time, since it definitely needs the TZDB.

Could we do it via ZoneVariant?

  • ZoneVariant::Standard
  • ZoneVariant::StandardStrange
  • ZoneVariant::Daylight
  • ZoneVariant::DaylightStrange

and if "StandardStrange" or "DaylightStrange" are used, then we print Generic Partial Location instead of Generic Non-Location when a Generic format is requested.

I think we could probably make the zone variant calculator figure this out, but I haven't thought through all the edge cases yet.
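A minimal sketch of how the formatter side of this idea might look, assuming the four variants listed above. The enum here is a hypothetical stand-in, not the existing `ZoneVariant` type, and how the calculator would decide on a "strange" variant is left open.

```rust
/// Hypothetical four-way zone variant, as proposed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ZoneVariant {
    Standard,
    StandardStrange,
    Daylight,
    DaylightStrange,
}

/// Which generic format to print (illustrative names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum GenericFormat {
    NonLocation,
    PartialLocation,
}

/// When a generic format is requested, the "strange" variants select
/// Generic Partial Location; the regular ones keep Generic Non-Location.
fn generic_format_for(variant: ZoneVariant) -> GenericFormat {
    match variant {
        ZoneVariant::StandardStrange | ZoneVariant::DaylightStrange => {
            GenericFormat::PartialLocation
        }
        ZoneVariant::Standard | ZoneVariant::Daylight => GenericFormat::NonLocation,
    }
}
```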

@robertbastian

sffc avatar Oct 02 '24 21:10 sffc

I'm not sure the zone variant calculator is enough. It is only supposed to know, in broad strokes, when a time zone switched metazones or zone variants, not about individual time zone transitions. I don't see how it could answer the question, "does this time in America/Phoenix differ from the time in America/Denver?" Unless there is a way? I hope I'm missing something.

sffc avatar Oct 02 '24 21:10 sffc

The only case where this can happen is if a zone has a different DST offset than the golden zone: https://unicode.org/reports/tr35/tr35-dates.html#goals:~:text=Except%20for%20daylight%20savings%2C%20at%20any%20given%20time%2C%20all%20zones%20in%20a%20metazone%20have%20the%20same%20offset%20at%20that%20time.

We can add a non_standard_dst flag to MetazonePeriodsV1, and then if zone_variant == ZoneVariant::daylight() && metazone.non_standard_dst -> use generic partial location
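The check could be sketched like this. `MetazonePeriod` is a hypothetical stand-in for an entry in `MetazonePeriodsV1`, and a plain enum replaces the real `ZoneVariant` type, so none of this reflects the actual data structs.

```rust
/// Simplified stand-in for the zone variant (assumption: two variants).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ZoneVariant {
    Standard,
    Daylight,
}

/// Hypothetical metazone period entry carrying the proposed flag.
struct MetazonePeriod {
    /// True if some zone in the metazone has a DST offset that differs
    /// from the golden zone's.
    non_standard_dst: bool,
}

/// Which generic format to print (illustrative names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum GenericFormat {
    NonLocation,
    PartialLocation,
}

fn generic_format_for(variant: ZoneVariant, metazone: &MetazonePeriod) -> GenericFormat {
    if variant == ZoneVariant::Daylight && metazone.non_standard_dst {
        GenericFormat::PartialLocation
    } else {
        GenericFormat::NonLocation
    }
}
```

Under this rule, an input whose variant is standard always gets the non-location format, whatever the flag says.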

robertbastian avatar Oct 02 '24 21:10 robertbastian

Oh, hmm, yeah that might work, although it would mean that you get "Mountain Time (Denver)", so basically one bad apple spoils the format for all time zones in that metazone.

sffc avatar Oct 02 '24 21:10 sffc

Ah actually America/Phoenix will never have ZoneVariant::daylight(). We would need to add the flag to ZoneOffsetPeriodsV1, so that we can determine whether the zone variant is ~standard~ regular.

robertbastian avatar Oct 02 '24 21:10 robertbastian

> Oh, hmm, yeah that might work, although it would mean that you get "Mountain Time (Denver)", so basically one bad apple spoils the format for all time zones in that metazone.

America/Denver is the golden zone, so it itself will not be considered non-standard.

robertbastian avatar Oct 02 '24 21:10 robertbastian

Maybe we just mark America/Phoenix as "strange" (within a certain period of time) and print "Mountain Time (Phoenix)" all year round?

sffc avatar Oct 02 '24 21:10 sffc

We probably shouldn't be the ones deciding this. Maybe bring it to CLDR Design WG.

sffc avatar Oct 02 '24 21:10 sffc

> Maybe we just mark America/Phoenix as "strange" (within a certain period of time) and print "Mountain Time (Phoenix)" all year round?

It would have to be all year round, because without the TZDB we cannot tell whether the golden zone is currently observing DST. We can only tell if an input zone is observing DST because we get an input offset, for the golden zone there is no way for us to get this information.

> We probably shouldn't be the ones deciding this. Maybe bring it to CLDR Design WG.

CLDR wants us to use the TZDB for this.

robertbastian avatar Oct 02 '24 21:10 robertbastian

> > We probably shouldn't be the ones deciding this. Maybe bring it to CLDR Design WG.
>
> CLDR wants us to use the TZDB for this.

CLDR only says: "when the generic non-location format is not specific enough", which is very non-specific.

sffc avatar Oct 02 '24 22:10 sffc

If TZDB is really the only way to do this, it's not completely out of the picture to make our own annotation in IXDTF. It is allowed in the spec for annotations to be extensible. Something like

2024-10-02T15:05:51-0700[America/Phoenix][u-td=yes]

where u-td=yes means roughly "Unicode Time zone Disambiguation: Yes"
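To show how a consumer might read such an annotation, here is a small sketch. The `u-td` key is the hypothetical annotation proposed above, not a registered one, and the bracket scanning is deliberately naive (it ignores the `!` critical flag and does no validation), so it is not a real IXDTF parser.

```rust
/// Returns the value of the first `[key=value]` annotation whose key matches.
/// Annotations without `=` (such as the time zone annotation) are skipped.
fn annotation_value<'a>(ixdtf: &'a str, key: &str) -> Option<&'a str> {
    let mut rest = ixdtf;
    while let Some(open) = rest.find('[') {
        let after_open = &rest[open + 1..];
        // Stop if a bracket is never closed.
        let close = after_open.find(']')?;
        let body = &after_open[..close];
        if let Some((k, v)) = body.split_once('=') {
            if k == key {
                return Some(v);
            }
        }
        rest = &after_open[close + 1..];
    }
    None
}

/// True if the string carries the hypothetical `u-td=yes` annotation.
fn needs_disambiguation(ixdtf: &str) -> bool {
    annotation_value(ixdtf, "u-td") == Some("yes")
}
```

For the string above, `needs_disambiguation` returns `true`; without the `[u-td=yes]` suffix it returns `false`.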

sffc avatar Oct 02 '24 22:10 sffc

> > CLDR wants us to use the TZDB for this.
>
> CLDR only says: "when the generic non-location format is not specific enough", which is very non-specific.

* I assume CLDR will tell us to use TZDB for this.

> If TZDB is really the only way to do this, it's not completely out of the picture to make our own annotation in IXDTF.

Not a fan of this. This would require it to be an additional field of whatever we settle on in #5533.

robertbastian avatar Oct 03 '24 09:10 robertbastian

A question Mark asked last time we briefly discussed this, IIRC, was "check what ICU4C does". I haven't checked yet but we should do that.

sffc avatar Oct 03 '24 11:10 sffc

ICU requires a full TZDB, so they can do the clever thing. We don't want to ship a full TZDB.

robertbastian avatar Oct 03 '24 11:10 robertbastian

> ICU requires a full TZDB, so they can do the clever thing. We don't want to ship a full TZDB.

Yes, but what are they actually doing?

sffc avatar Oct 03 '24 12:10 sffc

> Not a fan of this. This would require it to be an additional field of whatever we settle on in #5533.

If the condition isn't derivable from offset plus time zone identity, it's going to need to go somewhere in the input schema.

sffc avatar Oct 03 '24 12:10 sffc

Only if we want to do the fancy thing of only giving Phoenix special treatment while Denver is observing DST. If we special-case it year-round that can be derived from the time zone identity.

robertbastian avatar Oct 03 '24 12:10 robertbastian

Well, even in that case, don't we still need it in the input schema? Either as a unique zone variant, metazone, or additional field?

sffc avatar Oct 03 '24 15:10 sffc

We can mark America/Phoenix as weird in the metazone lookup. We have spare bits in the ASCII encoding. For example we could encode Ammt instead of ammt for "belongs to metazone, but needs partial location format".
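A rough sketch of that encoding trick, assuming metazone IDs are stored as four lowercase ASCII letters like the `ammt` example. `MetazoneEntry` and `decode_metazone` are hypothetical names; the real lookup data may be laid out differently.

```rust
/// Hypothetical decoded lookup entry.
#[derive(Debug, PartialEq, Eq)]
struct MetazoneEntry {
    /// Metazone ID normalized to lowercase, e.g. *b"ammt".
    id: [u8; 4],
    /// True if the zone belongs to the metazone but needs the
    /// partial location format.
    needs_partial_location: bool,
}

/// Uses the ASCII case bit of the first byte as the flag:
/// "Ammt" means metazone "ammt" with the flag set.
fn decode_metazone(raw: [u8; 4]) -> MetazoneEntry {
    let needs_partial_location = raw[0].is_ascii_uppercase();
    let mut id = raw;
    id[0] = id[0].to_ascii_lowercase();
    MetazoneEntry {
        id,
        needs_partial_location,
    }
}
```

Decoding `*b"Ammt"` and `*b"ammt"` yields the same `id`, differing only in `needs_partial_location`.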

robertbastian avatar Oct 03 '24 16:10 robertbastian

I'm assuming we land on the approach in https://github.com/unicode-org/icu4x/issues/5533 where the metazone and zone variant remain calculated by icu_timezone, meaning this would still need to be in the interface between FormattableTimeZone and datetime::Formatter.

sffc avatar Oct 03 '24 16:10 sffc

This is actually way more complex: https://unicode.org/reports/tr35/tr35-dates.html#Contents:~:text=Otherwise%20do%20the,Pacific%20Time%20(Whitehorse)%22

robertbastian avatar Oct 04 '24 12:10 robertbastian

Hmm, I must have overlooked that algorithm in the spec. The good news, though, is that it is based on time zone identity rather than the TZDB.

sffc avatar Oct 04 '24 15:10 sffc

Bad news is that it also depends on the locale's region.
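A rough sketch of that fallback, based on one reading of the linked TR35 passage. All names are hypothetical, and the caller is assumed to have already resolved the preferred zone of the metazone for the locale's region (falling back to 001 when the region has none).

```rust
/// Which shape of generic name to print (illustrative names).
#[derive(Debug, PartialEq, Eq)]
enum GenericName {
    /// e.g. "Pacific Time"
    Metazone,
    /// e.g. "Pacific Time (Canada)"
    MetazoneWithCountry,
    /// e.g. "Pacific Time (Whitehorse)"
    MetazoneWithCity,
}

/// Hypothetical per-zone input.
struct ZoneInfo<'a> {
    id: &'a str,
    /// Whether this zone is the metazone's preferred zone for its own country.
    preferred_in_own_country: bool,
}

/// `preferred_for_locale_region` is the metazone's preferred zone for the
/// locale's region (assumed to be resolved by the caller).
fn generic_name(zone: &ZoneInfo<'_>, preferred_for_locale_region: &str) -> GenericName {
    if zone.id == preferred_for_locale_region {
        GenericName::Metazone
    } else if zone.preferred_in_own_country {
        // Preferred in its own country, but not in the locale's country.
        GenericName::MetazoneWithCountry
    } else {
        GenericName::MetazoneWithCity
    }
}
```

The same zone can therefore format differently per locale: a zone that is preferred in its own country gets the plain metazone name there, and the country-qualified name in a locale whose region prefers another zone.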

robertbastian avatar Jan 20 '25 16:01 robertbastian