Create data provider for an ideal components bag

Open jedel1043 opened this issue 3 years ago • 4 comments

Closes #1318

This follows the Provider data format JSON as closely as possible.

There's still a lot of work to be done, but I'm opening a draft to get some comments on the structure of the provider.

Some thoughts:

  • It would be good to have a concept of a generic Pattern (not to be confused with GenericPattern) restricted to a subset of FieldSymbols. That way, e.g., year_month_day could be restricted to a Pattern containing only the Year, Month and Day symbols, which should make parsing a bit easier.
  • preferred_hour_cycle is already provided by the current version of CLDR, right?
  • time_zone is still a WIP, since I'm not sure what underlying data it needs.
  • Is there a need for glue in time formats?
  • Always open for better name proposals 😅
  • The proposed JSON includes "mixed" patterns with format fields + placeholders. I'm creating an entirely new pattern struct to handle this, but there's a possibility of unifying Pattern, GenericPattern and this new MixedPattern in a single struct.
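
To make the first bullet concrete, here is a minimal sketch of a pattern restricted to a subset of symbols. Every name in it (`Symbol`, `RestrictedPattern`, `try_new`, `RestrictedPatternError`) is hypothetical and is not the ICU4X API. It also simplifies heavily: it ignores quoting and treats every unrecognized character as a literal.

```rust
// Hypothetical sketch: none of these names are the actual ICU4X API.

/// A small stand-in for the field symbols a pattern may contain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Symbol {
    Year,
    Month,
    Day,
    Hour,
}

impl Symbol {
    /// Maps a pattern letter to a symbol (only a tiny subset of letters).
    fn from_char(c: char) -> Option<Self> {
        match c {
            'y' => Some(Symbol::Year),
            'M' => Some(Symbol::Month),
            'd' => Some(Symbol::Day),
            'H' => Some(Symbol::Hour),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum RestrictedPatternError {
    /// The pattern used a symbol outside the allowed subset.
    DisallowedSymbol(Symbol),
}

/// A pattern whose fields are guaranteed to come from an allowed subset.
#[derive(Debug, PartialEq, Eq)]
pub struct RestrictedPattern {
    /// Each field as (symbol, run length), e.g. "MM" becomes (Month, 2).
    pub fields: Vec<(Symbol, usize)>,
}

impl RestrictedPattern {
    /// Parses `pattern`, rejecting any symbol that is not in `allowed`.
    /// Assumption: no quoting; unrecognized characters are literals and are skipped.
    pub fn try_new(pattern: &str, allowed: &[Symbol]) -> Result<Self, RestrictedPatternError> {
        let mut fields: Vec<(Symbol, usize)> = Vec::new();
        let mut prev: Option<char> = None;
        for c in pattern.chars() {
            if let Some(symbol) = Symbol::from_char(c) {
                if !allowed.contains(&symbol) {
                    return Err(RestrictedPatternError::DisallowedSymbol(symbol));
                }
                if prev == Some(c) {
                    // Same letter repeated: extend the current field's length.
                    if let Some(last) = fields.last_mut() {
                        last.1 += 1;
                    }
                } else {
                    fields.push((symbol, 1));
                }
            }
            prev = Some(c);
        }
        Ok(Self { fields })
    }
}
```

With this shape, something like year_month_day could be parsed with `allowed = [Year, Month, Day]`, and a stray hour symbol would be rejected at load time rather than at format time.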

jedel1043 avatar Sep 11 '22 19:09 jedel1043

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Sep 11 '22 19:09 CLAassistant

cc @zbraniecki

jedel1043 avatar Sep 20 '22 17:09 jedel1043

Suggestion: Maybe do this in a way that we can use MixedPatternItem also for regular PatternItem so we don't have so much code duplication. PatternItem doesn't need Placeholder(u8). If it encounters one at runtime, it can either error out or substitute some other token in its place.

I think MixedPatternItem is more akin to GenericPatternItem than to PatternItem, since it should eventually support something like the GenericPattern::combined operation. Also, I have some ideas on how to simplify some of the code to reduce duplication :)
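
For illustration only, the two options discussed here could look roughly like the sketch below. The types `MixedItem` and `PlainItem`, and the functions `to_plain` and `combined`, are hypothetical stand-ins, not the code in this PR and not the signature of `GenericPattern::combined`. Fields are reduced to a letter plus a length to keep the example short.

```rust
// Hypothetical sketch: simplified stand-ins, not the types in this PR.

/// An item of a "mixed" pattern: format fields, literals and placeholders.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MixedItem {
    /// A field, reduced here to its pattern letter and length.
    Field(char, u8),
    Literal(char),
    /// A numbered slot, e.g. `{0}`.
    Placeholder(u8),
}

/// An item of a regular pattern: no placeholders allowed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlainItem {
    Field(char, u8),
    Literal(char),
}

#[derive(Debug, PartialEq, Eq)]
pub enum PatternError {
    /// A placeholder appeared where a regular pattern was expected.
    UnexpectedPlaceholder(u8),
    /// A placeholder had no replacement pattern to substitute.
    MissingReplacement(u8),
}

/// Option 1: share the mixed item type and error out on placeholders.
pub fn to_plain(items: &[MixedItem]) -> Result<Vec<PlainItem>, PatternError> {
    items
        .iter()
        .map(|item| match *item {
            MixedItem::Field(letter, len) => Ok(PlainItem::Field(letter, len)),
            MixedItem::Literal(c) => Ok(PlainItem::Literal(c)),
            MixedItem::Placeholder(n) => Err(PatternError::UnexpectedPlaceholder(n)),
        })
        .collect()
}

/// Option 2: treat the mixed pattern like a generic pattern and substitute
/// each placeholder with the replacement pattern at the same index.
pub fn combined(
    mixed: &[MixedItem],
    replacements: &[Vec<PlainItem>],
) -> Result<Vec<PlainItem>, PatternError> {
    let mut out = Vec::new();
    for item in mixed {
        match *item {
            MixedItem::Field(letter, len) => out.push(PlainItem::Field(letter, len)),
            MixedItem::Literal(c) => out.push(PlainItem::Literal(c)),
            MixedItem::Placeholder(n) => {
                let replacement = replacements
                    .get(n as usize)
                    .ok_or(PatternError::MissingReplacement(n))?;
                out.extend_from_slice(replacement);
            }
        }
    }
    Ok(out)
}
```

In this sketch both paths consume the same `MixedItem` slice, which is one way the duplication concern and the need for a combining operation could coexist.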

jedel1043 avatar Sep 20 '22 17:09 jedel1043

Notice: the branch changed across the force-push!

  • Cargo.lock is different
  • components/datetime/src/pattern/item/mixed.rs is different
  • components/datetime/src/pattern/reference/display.rs is different
  • components/datetime/src/pattern/runtime/mixed.rs is different
  • provider/datagen/Cargo.toml is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot