
Support retrieval of the host system time zone as icu_timezone::CustomTimeZone

Open hsivonen opened this issue 2 years ago • 9 comments

Instantiating an icu_timezone::CustomTimeZone that represents the current time zone of the host system is an operation that is likely to be commonly needed by apps while requiring special expertise (as well as access to alias data) to implement correctly. Therefore, it would be natural for ICU4X to provide this operation.

AFAICT, ICU4C provides this but both Chromium and Firefox work around https://unicode-org.atlassian.net/browse/ICU-13694 , so ICU4X should be written not to need a workaround.

(Since this operation is inherently both operating system-dependent and operating system-specific, it obviously can't be an inseparable part of icu_timezone but either needs to be omissible using cargo features or needs to be in its own crate so that icu_timezone continues to work in a mere core+alloc context.)

hsivonen avatar Feb 01 '23 09:02 hsivonen
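
For illustration, a minimal sketch of what the POSIX-side lookup might involve, assuming the common convention that `TZ` or the `/etc/localtime` symlink names a file under a `zoneinfo/` directory. The function names are hypothetical and not part of ICU4X. The result is only a raw IANA-style name; mapping it to a canonical ID (the alias-data step mentioned above) and handling Windows are left out.

```rust
use std::env;
use std::fs;

/// Extract the part after the last "zoneinfo/" in a path, e.g.
/// "/usr/share/zoneinfo/Europe/Helsinki" -> "Europe/Helsinki".
fn id_from_zoneinfo_path(path: &str) -> Option<&str> {
    const MARKER: &str = "zoneinfo/";
    let start = path.rfind(MARKER)? + MARKER.len();
    let id = &path[start..];
    if id.is_empty() {
        None
    } else {
        Some(id)
    }
}

/// Interpret a TZ value. Simplified assumption: only "Area/Location" names,
/// absolute zoneinfo paths, and the literal "UTC" are accepted; POSIX rule
/// strings such as "EST5EDT" are not mapped.
fn id_from_tz_var(tz: &str) -> Option<&str> {
    let name = tz.strip_prefix(':').unwrap_or(tz);
    if name.starts_with('/') {
        id_from_zoneinfo_path(name)
    } else if name == "UTC" || name.contains('/') {
        Some(name)
    } else {
        None
    }
}

/// Hypothetical host lookup: TZ first, then the /etc/localtime symlink target.
fn host_time_zone_id() -> Option<String> {
    if let Ok(tz) = env::var("TZ") {
        if let Some(id) = id_from_tz_var(&tz) {
            return Some(id.to_string());
        }
    }
    let target = fs::read_link("/etc/localtime").ok()?;
    id_from_zoneinfo_path(target.to_str()?).map(str::to_string)
}

fn main() {
    match host_time_zone_id() {
        Some(id) => println!("host time zone name: {id}"),
        None => println!("host time zone could not be determined"),
    }
}
```

Since this depends on `std::env` and `std::fs`, it would have to sit behind a feature or in a separate crate, in line with the core+alloc concern above.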

We have std features on all crates, so it can be conditional on that.

I'm not sure this method should be tz-only though. Maybe we should have get_system_locale, which encapsulates all system information (language, region, hour format, time zone, etc.). This allows us to keep the system interface as small as possible, and makes it easy to use locales from other sources instead. If we had special system constructors on higher levels (like for tz, datetime, etc.), I think it would get messy.

robertbastian avatar Feb 03 '23 09:02 robertbastian

Is our Locale object sufficient for the data? Is BCP47 -u-tz robust enough to capture possible situations with host time zone? AFAICT, -u-tz takes a time zone id and cannot fall back to UTC offset.

In general, though, of the system information, BCP47 does not encapsulate the short date format that is user-settable on Microsoft and Apple systems. It's unclear to me if actual use cases go beyond overriding the locale default to yyyy-MM-dd. That is, it's unclear to me if a boolean flag (to force yyyy-MM-dd) as a -u- extension would sufficiently cover use cases.

hsivonen avatar Feb 03 '23 11:02 hsivonen

@zbraniecki for discussion of retrieval of preferences from the OS

sffc avatar Feb 04 '23 01:02 sffc

Discuss with:

  • @sffc
  • @zbraniecki
  • ~@nordzilla~
  • @Manishearth
  • @leftmostcat

Optional:

  • @hsivonen

sffc avatar May 11 '23 18:05 sffc

The discussion should center around the general location of the code and its behavior. Details of the API should be proposed by the person who takes this issue.

sffc avatar Mar 11 '24 13:03 sffc

Discussion in the ICU4X-WG call:

  • @zbraniecki - I'll try to land icu_preferences by the weekend so that it can be ready for review and landed in time for the GSoC contributor.
  • @zbraniecki - My position is that it is the responsibility of the client to use the crate to load the system preferences and merge it.
  • @robertbastian - I'm aligned that it should be a separate crate.
  • @sffc - I definitely think we should never implicitly fall back to system locale.
  • @zbraniecki - An interesting part of the project is how you resolve the differences in how OSes handle preferences: Windows, macOS, Linux/POSIX. There are interesting technical questions. I've worked on this previously in Firefox.
  • @sffc - It sounds like @zbraniecki is by far the most well-equipped to mentor the project from the technical side.
  • @zbraniecki - I'm happy to do that and happy to co-mentor.
  • @robertbastian - I can be a co-mentor, once I understand the responsibilities better.
  • @echeran - I'll help out where needed with mentorship.

sffc avatar May 02 '24 18:05 sffc

Some discussion:

  • Ashutosh: My initial approach was to use maps to collect the different settings. I made a mapping in Linux for LC_COLLATE, LC_ALL, etc. But on Windows, if I want to get the calendars, there's a variable about global preferences. How would I change the keys of the HashMap for this to be better? I could also have the keys in a generic format.
  • @zbraniecki - You're exploring a divergence in how OSes store locales. The POSIX model is incompatible with modern OSes. What I'd suggest is that we have a separate deep dive on this. But for the HashMap, you have two taxonomies to think about. Either you represent the OS taxonomy, which is different between Mac, Linux, Windows, ..., or you use ICU components as keys, and say things like "here is a Windows locale for plural rules". The per-component granularity is higher, even though in many cases you will have the same locale for 20 components. One locale per component is more precise. A key being LC_COLLATE means that the programmer needs to do the mapping.
  • @zbraniecki - The second problem is whether we really want different locales. POSIX is the only platform that makes different locales for different components a primary feature. Most OSes now let users select a single locale. The idea that you want dates in Polish and numbers in French is exceptionally rare, and my experience at Mozilla was that it caused more confusion. What we did at Mozilla was ignore the POSIX model and just assume that all components will use the same locale. Also, modern OSes give a list of locales, but LC_COLLATE is only a single locale.
  • @sffc - I imagine that this would be an option on a locale factory object.
  • @zbraniecki - Overall I would suggest focusing on Android or macOS.

sffc avatar Jun 07 '24 07:06 sffc
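
To make the two taxonomies above concrete, here is a rough sketch, assuming a map keyed by ICU-style components and filled from POSIX variables with the standard precedence (`LC_ALL`, then the category, then `LANG`). The `Component` enum, the function names, and the choice of `LC_MESSAGES` for the single-locale case are all hypothetical illustrations, not ICU4X or Firefox behavior. The POSIX-to-BCP 47 conversion is deliberately simplified: it drops the codeset and modifier.

```rust
use std::collections::HashMap;

/// Hypothetical ICU-side component keys (not an ICU4X type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Component {
    Collation,
    Numbers,
    DateTime,
    Messages,
}

const ALL: [Component; 4] = [
    Component::Collation,
    Component::Numbers,
    Component::DateTime,
    Component::Messages,
];

impl Component {
    /// The POSIX category that feeds this component; the mapping lives in
    /// the library so the caller never has to know about LC_* names.
    fn posix_category(self) -> &'static str {
        match self {
            Component::Collation => "LC_COLLATE",
            Component::Numbers => "LC_NUMERIC",
            Component::DateTime => "LC_TIME",
            Component::Messages => "LC_MESSAGES",
        }
    }
}

/// POSIX precedence: LC_ALL, then the category, then LANG. Empty means unset.
fn resolve_posix<'a>(env: &'a HashMap<String, String>, category: &str) -> Option<&'a str> {
    for name in ["LC_ALL", category, "LANG"] {
        if let Some(value) = env.get(name) {
            if !value.is_empty() {
                return Some(value.as_str());
            }
        }
    }
    None
}

/// Simplified conversion: "pl_PL.UTF-8@euro" -> "pl-PL"; "C"/"POSIX" -> "und".
fn posix_to_bcp47(posix: &str) -> String {
    let base = posix
        .split(|c: char| c == '.' || c == '@')
        .next()
        .unwrap_or("");
    match base {
        "" | "C" | "POSIX" => "und".to_string(),
        other => other.replace('_', "-"),
    }
}

/// Component-keyed view: one tag per component.
fn per_component(env: &HashMap<String, String>) -> HashMap<Component, String> {
    ALL.iter()
        .copied()
        .filter_map(|c| resolve_posix(env, c.posix_category()).map(|v| (c, posix_to_bcp47(v))))
        .collect()
}

/// Single-locale view: ignore per-category differences (assumed policy).
fn single_locale(env: &HashMap<String, String>) -> String {
    resolve_posix(env, "LC_MESSAGES")
        .map(posix_to_bcp47)
        .unwrap_or_else(|| "und".to_string())
}

fn main() {
    let env: HashMap<String, String> = std::env::vars().collect();
    println!("per component: {:?}", per_component(&env));
    println!("single locale: {}", single_locale(&env));
}
```

Taking the environment as a plain map rather than reading `std::env` directly keeps the resolution logic testable and leaves room for a Windows or macOS backend to fill the same component-keyed map from its own settings.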