Support (loose) string-to-property-map matching in `icu_properties`

Open skius opened this issue 2 years ago • 3 comments

`icu_properties` does not support obtaining a `CodePointMapData` (or `CodePointSet`) from a property-name string for many properties. ECMA-262 binary properties are supported through `load_for_ecma_262_unstable`, but `load_word_break`, for example, cannot be reached from a string; clients would have to map property names to `icu_properties` functions themselves. Loose matching of property names should also be supported.

UnicodeSet could also support more properties with those features.
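
For illustration, here is a minimal sketch of what loose property-name matching could look like, roughly following rule UAX44-LM3 from UAX #44 (ignore case, whitespace, underscores, hyphens, and an initial "is" prefix). The `PropertyKind` enum, the `NAMES` table, and both function names are hypothetical and not part of `icu_properties`; a real implementation would presumably return the loaded data rather than an enum tag.

```rust
/// Hypothetical stand-in for the properties a client might request by name.
/// A real API would likely hand back the property data instead of a tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PropertyKind {
    WordBreak,
    LineBreak,
    Script,
}

/// Illustrative table of long and short property names (assumed, not exhaustive).
const NAMES: &[(&str, PropertyKind)] = &[
    ("Word_Break", PropertyKind::WordBreak),
    ("WB", PropertyKind::WordBreak),
    ("Line_Break", PropertyKind::LineBreak),
    ("lb", PropertyKind::LineBreak),
    ("Script", PropertyKind::Script),
    ("sc", PropertyKind::Script),
];

/// Normalize a name in the spirit of UAX44-LM3: drop whitespace, '_' and '-',
/// fold ASCII case, then strip one leading "is" if present.
fn normalize_loose(name: &str) -> String {
    let folded: String = name
        .chars()
        .filter(|c| !c.is_whitespace() && *c != '_' && *c != '-')
        .map(|c| c.to_ascii_lowercase())
        .collect();
    match folded.strip_prefix("is") {
        Some(rest) => rest.to_string(),
        None => folded,
    }
}

/// Look up a property by loosely matching the input against the table.
/// Both sides are normalized, so table entries may use any spelling.
fn lookup_property(name: &str) -> Option<PropertyKind> {
    let wanted = normalize_loose(name);
    NAMES
        .iter()
        .find(|(candidate, _)| normalize_loose(candidate) == wanted)
        .map(|(_, kind)| *kind)
}
```

With this sketch, `lookup_property("word-break")`, `lookup_property("Is_Word_Break")`, and `lookup_property("wb")` all resolve to the same property, while a name missing from the table yields `None`. Normalizing the table entries on every call keeps the example short; a production version would precompute the normalized keys.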

skius, Jun 21 '23 13:06

Discussion: revisit when we have a clearer picture of the set of properties needed by the transliterator.

Discuss with:

  • @skius
  • @echeran
  • @Manishearth
  • @sffc

Optional:

  • @robertbastian

sffc, Jun 22 '23 17:06

Yep, when I added the ECMA-262 function, my hope was to potentially add this one as well, but we decided to wait for a concrete use case. It does seem like we have one now!

Manishearth, Jun 22 '23 17:06

@makotokato added some additional properties needed for Segmenter data in #4175

sffc, Oct 18 '23 18:10