locid: implement Serde for `Locale`

Open tapeinosyne opened this issue 2 years ago • 2 comments

Currently, the serde serialization feature in locid is implemented for LanguageIdentifier only. This basic PR extends it to Locale, using the same (trivial) implementation that defers to string conversion and parsing; tests are similarly basic, but can be expanded as we like.
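As a rough illustration of what "defers to string conversion and parsing" means, here is a minimal, std-only sketch. `TagStandIn`, `ParseTagError`, `to_wire`, and `from_wire` are hypothetical names invented for this example, and the validation is a toy, not real BCP-47 parsing. In an actual serde implementation, the same two steps would presumably sit inside `Serialize::serialize` (for instance via `Serializer::serialize_str`) and `Deserialize::deserialize`; this is not the code from the PR.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a locale type; NOT the icu_locid `Locale`.
/// It only stores the tag text so the round-trip can be shown with std alone.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TagStandIn(String);

/// Hypothetical error type for the toy parser below.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseTagError;

impl FromStr for TagStandIn {
    type Err = ParseTagError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Toy validation (assumption): non-empty ASCII-alphanumeric subtags
        // of at most 8 characters, separated by '-'. Real parsing is stricter.
        let ok = !s.is_empty()
            && s.split('-').all(|sub| {
                !sub.is_empty()
                    && sub.len() <= 8
                    && sub.chars().all(|c| c.is_ascii_alphanumeric())
            });
        if ok {
            Ok(TagStandIn(s.to_owned()))
        } else {
            Err(ParseTagError)
        }
    }
}

impl fmt::Display for TagStandIn {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// "Serialize" step: defer to the type's `Display` implementation.
pub fn to_wire<T: fmt::Display>(value: &T) -> String {
    value.to_string()
}

/// "Deserialize" step: defer to the type's `FromStr` implementation.
pub fn from_wire<T: FromStr>(wire: &str) -> Result<T, T::Err> {
    wire.parse()
}
```

With this sketch, `from_wire::<TagStandIn>("de-u-co-phonebk")` succeeds and `to_wire` on the result gives back the same string, while malformed input such as `"de--u"` is rejected at parse time. The appeal of the approach is that the serialized form is just the canonical tag string, so any format that can store a string can store the value.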

Motivation

I maintain the hyphenation crate, and would like to:

  • use typed language representations rather than naked BCP-47 strings.
  • use types that are likely to be shared across crates, to minimize dependency creep for downstream users.

Moreover, such language types need to:

  • support Unicode extensions, thus disqualifying LanguageIdentifier.
  • be serializable, for on-disk storage and on-demand loading as part of hyphenation dictionaries.

So, I reckon that ICU4X's Locale is the best candidate for a widely shared language type that suits hyphenation, and it would be nice if it could be serialized without going through newtypes. (I was in fact surprised to discover that LanguageIdentifier was serializable but Locale was not, since the documentation recommends the latter over the former.)

tapeinosyne avatar Oct 03 '21 21:10 tapeinosyne

Notice: the branch changed across the force-push!

  • components/locid/src/serde/langid.rs is now changed in the branch
  • components/locid/src/serde/locid.rs is different

~ Your Friendly Jira-GitHub PR Checker Bot

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Sep 01 '22 06:09 CLAassistant

We now have a document explaining how to serialize locales via zerovec:

https://icu4x.unicode.org/doc/icu_locid/zerovec/index.html

sffc avatar Nov 10 '22 18:11 sffc