Default Caseless Matching of possibly-localized family names

Open SimonSapin opened this issue 6 years ago • 2 comments

Servo would like to implement CSS on top of font-kit. The spec (relevant bits below):

  • Is very specific about (case-insensitive) string comparison for font family names
  • Requires accepting alternative (localized) family names of any given font

Source::select_family_by_name looks like the appropriate API for looking up system fonts matching a given family name string. Some research is needed to figure out whether the underlying APIs (fontconfig/CoreText/DirectWrite) behind the default Source on each supported platform behave as CSS requires on these two points, or whether they can be configured to do so.

An alternative could be to enumerate all available fonts and their family names, and maintain a Rust (hash?) map where keys are normalized with default case fold. But I suspect this would have significant startup cost.
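To make the map idea concrete, here is a minimal sketch of what such an index might look like. Everything in it is hypothetical: `FamilyIndex`, `fold_key`, and the use of `usize` as a font identifier are placeholders, not font-kit API. In particular `fold_key` just calls `to_lowercase` as a stand-in, which is not the Default Caseless Matching the spec asks for.

```rust
use std::collections::HashMap;

/// Hypothetical placeholder for Unicode default case folding.
/// `to_lowercase` is only a stand-in here and is NOT equivalent to folding.
fn fold_key(name: &str) -> String {
    name.to_lowercase()
}

/// Hypothetical index from folded family name to the fonts carrying that name.
/// Fonts are identified by plain `usize` indices for illustration only.
#[derive(Default)]
struct FamilyIndex {
    by_name: HashMap<String, Vec<usize>>,
}

impl FamilyIndex {
    /// Register every known (possibly localized) family name of one font.
    fn insert(&mut self, font_id: usize, family_names: &[&str]) {
        for name in family_names {
            let ids = self.by_name.entry(fold_key(name)).or_default();
            if !ids.contains(&font_id) {
                ids.push(font_id);
            }
        }
    }

    /// Look up a requested family name; returns an empty slice when unknown.
    fn lookup(&self, requested: &str) -> &[usize] {
        self.by_name
            .get(&fold_key(requested))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

Because each font is inserted under all of its names, a lookup by any localized name resolves to the same font regardless of the system locale. The startup cost is the enumeration itself, which this sketch does not attempt to measure.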


https://drafts.csswg.org/css-fonts/#font-family-casing

User agents must match these names case insensitively, using the "Default Caseless Matching" algorithm outlined in the Unicode specification [UNICODE]. This algorithm is detailed in section 3.13 entitled "Default Case Algorithms". Specifically, the algorithm must be applied without normalizing the strings involved and without applying any language-specific tailorings. The case folding method specified by this algorithm uses the case mappings with status field ‘C’ or ‘F’ in the CaseFolding.txt file of the Unicode Character Database [UNICODE].

[…]

Implementors should take care to verify that a given caseless string comparison implementation uses this precise algorithm and not assume that a given platform string matching routine follows it, as many of these have locale-specific behavior or use some level of string normalization [UAX15].
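For illustration, the folding step quoted above could be sketched roughly like this, assuming the text of CaseFolding.txt has already been loaded into a string. The function names (`parse_case_folding`, `fold`, `caseless_match`) are made up for this sketch. It keeps only the ‘C’ and ‘F’ entries, applies no normalization and no language-specific tailoring, and compares the folded strings code point by code point.

```rust
use std::collections::HashMap;

/// Parse the text of CaseFolding.txt, keeping only status C and F entries.
/// Each data line has the form `<code>; <status>; <mapping>; # <name>`.
fn parse_case_folding(data: &str) -> HashMap<char, String> {
    let mut table = HashMap::new();
    for line in data.lines() {
        // Drop the trailing comment and skip blank or comment-only lines.
        let line = line.split('#').next().unwrap_or("").trim();
        if line.is_empty() {
            continue;
        }
        let mut fields = line.split(';').map(str::trim);
        let (Some(code), Some(status), Some(mapping)) =
            (fields.next(), fields.next(), fields.next())
        else {
            continue;
        };
        // Skip simple (S) and Turkic (T) mappings, as the spec requires.
        if status != "C" && status != "F" {
            continue;
        }
        let Some(source) = parse_hex_char(code) else { continue };
        let folded: Option<String> =
            mapping.split_whitespace().map(parse_hex_char).collect();
        if let Some(folded) = folded {
            table.insert(source, folded);
        }
    }
    table
}

/// Parse one hexadecimal code point such as `00DF`.
fn parse_hex_char(hex: &str) -> Option<char> {
    u32::from_str_radix(hex, 16).ok().and_then(char::from_u32)
}

/// Fold a string using the table; unmapped characters pass through unchanged.
fn fold(table: &HashMap<char, String>, s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match table.get(&c) {
            Some(mapped) => out.push_str(mapped),
            None => out.push(c),
        }
    }
    out
}

/// Two names match caselessly when their folded forms are identical.
fn caseless_match(table: &HashMap<char, String>, a: &str, b: &str) -> bool {
    fold(table, a) == fold(table, b)
}
```

A real implementation would more likely use a precompiled table than parse the file at startup; the parsing is shown only to make the ‘C’/‘F’ filtering explicit.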

https://drafts.csswg.org/css-fonts/#font-style-matching

On systems containing fonts with multiple localized font family names, user agents must match any of these names independent of the underlying system locale or platform API used.

https://drafts.csswg.org/css-fonts/#family-name-value

Some font formats allow fonts to carry multiple localizations of the family name. User agents must recognize and correctly match all of these names independent of the underlying platform localization, system API used or document encoding:

SimonSapin avatar Sep 12 '19 13:09 SimonSapin

Yeah, I think that font-kit should match on all names just as CSS does. If we have to maintain some sort of in-memory cache, let's do so. @raphlinus suggested to me that such a cache would probably not be too expensive.

pcwalton avatar Sep 13 '19 17:09 pcwalton

An alternative could be to enumerate all available fonts and their family names, and maintain a Rust (hash?) map where keys are normalized with default case fold. But I suspect this would have significant startup cost.

It looks like Servo already does this (with to_lowercase instead of default case folding) so maybe it’s viable?

https://github.com/servo/servo/blob/5f55cd5d71df9c555fbc24777168396ddd539f28/components/gfx/font_cache_thread.rs#L134
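To show where to_lowercase and case folding can disagree, here is a small sketch. The family names are made-up test strings, and `toy_fold` is a deliberately incomplete illustration that special-cases only sharp s and final sigma before falling back to `char::to_lowercase`; it is not a substitute for the full CaseFolding.txt table.

```rust
/// Compare two names the way a `to_lowercase`-keyed map effectively does.
fn lowercase_eq(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

/// Illustration-only fold covering just the cases discussed here.
/// NOT a complete implementation of Unicode default case folding.
fn toy_fold(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            // Full case folding maps both sharp s forms to "ss".
            'ß' | 'ẞ' => out.push_str("ss"),
            // Case folding maps final sigma to the ordinary small sigma.
            'ς' => out.push('σ'),
            // Fallback for this sketch only: context-free per-char lowercasing.
            _ => out.extend(c.to_lowercase()),
        }
    }
    out
}

/// Compare two names using the toy fold above.
fn toy_fold_eq(a: &str, b: &str) -> bool {
    toy_fold(a) == toy_fold(b)
}
```

With these helpers, `lowercase_eq("Straße", "STRASSE")` is false while `toy_fold_eq("Straße", "STRASSE")` is true, since lowercasing leaves ß alone but folding expands it to "ss". Names containing such characters are presumably rare, so the to_lowercase approach may be close enough in practice, but it is not what the spec requires.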

SimonSapin avatar Feb 05 '20 10:02 SimonSapin