Provide better support for mixed writing systems
This is mainly a problem with the AGG backends. The Quartz and QPainter backends already behave correctly.
Basically, user code should be able to call `show_text` on the following string: "Kiva Graphics一番😎" and have it render correctly even if the currently selected font only supports Latin characters.
To get to this point, we need to do a few things:
- [x] Collect writing system information for entries in our font database
- [x] Build fallback lists for font families and styles
- [x] Find a library to use, or failing that, write our own function which breaks a string up into chunks which share the same writing system (see: https://stackoverflow.com/questions/9868792/find-out-the-unicode-script-of-a-character)
- [x] Make low level text drawing functions return the text cursor position after drawing a run of glyphs (or work around the absence by calling `get_text_extent` on every chunk of a string before drawing)
- [ ] Bring everything together in the `show_text` method so that mixed strings can be drawn
- [ ] Bonus: Support bidirectional text mixing
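For the string-chunking step, here is a minimal sketch of what "break a string up into chunks which share the same writing system" could look like. Everything in it is an assumption for illustration: the names `script_of` and `split_runs` are hypothetical, the range table is a tiny hand-picked subset (a real implementation would read the Unicode script data or use a library), and "Emoji" is treated as its own bucket for font selection even though Unicode does not define it as a script. Characters outside the table (spaces, digits, punctuation) are treated as "Common" and attached to the neighbouring run.

```python
# Hand-picked code point ranges, for illustration only (NOT complete).
_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x00C0, 0x00D6, "Latin"),
    (0x00D8, 0x00F6, "Latin"),
    (0x00F8, 0x024F, "Latin"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),
    (0xAC00, 0xD7AF, "Hangul"),
    (0x1F300, 0x1FAFF, "Emoji"),  # assumption: a bucket for fallback, not a script
]


def script_of(ch):
    """Return a rough script name for one character, or "Common"."""
    cp = ord(ch)
    for lo, hi, name in _RANGES:
        if lo <= cp <= hi:
            return name
    return "Common"


def split_runs(text):
    """Split text into (script, substring) runs.

    "Common" characters join the run before them. A leading "Common"
    run adopts the script of the first classified character after it.
    """
    runs = []  # list of [script, [chars]]
    for ch in text:
        script = script_of(ch)
        if runs and (script == "Common" or runs[-1][0] == script):
            runs[-1][1].append(ch)
        elif runs and runs[-1][0] == "Common":
            runs[-1][0] = script
            runs[-1][1].append(ch)
        else:
            runs.append([script, [ch]])
    return [(script, "".join(chars)) for script, chars in runs]


print(split_runs("Kiva Graphics一番😎"))
# → [('Latin', 'Kiva Graphics'), ('Han', '一番'), ('Emoji', '😎')]
```

Attaching "Common" characters to the preceding run keeps the space in "Kiva Graphics" from forcing a font switch in the middle of a Latin phrase.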
This is roughly what Qt does, based on a quick skim of the code: https://code.qt.io/cgit/qt/qtbase.git/
- `QFreeTypeFontDatabase::addTTFile` (qtbase.git/tree/src/gui/text/freetype/qfreetypefontdatabase.cpp): Scans a font for the following information: weight, style, fixed-width, supported writing systems (unicode range, codepage range), family name
- `QPlatformFontDatabase::fallbacksForFamily` (qtbase.git/tree/src/gui/text/qfontdatabase.cpp): Takes a style and script ID and returns a list of fonts which support that script with that style (or just support the script)
- `QPainter::drawText` (qtbase.git/tree/src/gui/painting/qpainter.cpp): Basically Qt's `show_text`. Uses `QStackTextEngine` for shaping and for breaking up the input string into `QScriptItem` objects. Picks the font per item and draws it.
- `QStackTextEngine`/`QScriptItem`/`QTextItemInt` (qtbase.git/tree/src/gui/text/qtextengine.cpp): These are the components which break up a string into chunks which can be shaped and drawn as a unit.
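Translated to our side, the "pick the font per item and draw it" part could be sketched roughly like this. This is not Qt's code and not an existing Kiva API: `FONT_SCRIPTS`, `FALLBACKS`, `pick_font`, `place_runs` and the `measure` callback are all hypothetical stand-ins for the font database, the fallback list, and whatever `get_text_extent` ends up providing per chunk.

```python
# Hypothetical font database: family name -> scripts it covers.
FONT_SCRIPTS = {
    "Helvetica": {"Latin"},
    "Noto Sans CJK JP": {"Latin", "Han", "Hiragana", "Katakana"},
    "Noto Color Emoji": {"Emoji"},
}

# Hypothetical fallback order, tried after the selected font.
FALLBACKS = ["Noto Sans CJK JP", "Noto Color Emoji"]


def pick_font(selected, script):
    """Return the first family covering `script`, preferring `selected`."""
    for family in [selected, *FALLBACKS]:
        if script in FONT_SCRIPTS.get(family, ()):
            return family
    # Nothing covers the script: keep the selected font.
    return selected


def place_runs(runs, selected, measure, x=0.0):
    """Assign a font and a starting x position to each (script, text) run.

    `measure(family, text)` is assumed to return the advance width of
    `text` in `family`. Returns the placed runs and the final cursor x.
    """
    placed = []
    for script, text in runs:
        family = pick_font(selected, script)
        placed.append((x, family, text))
        x += measure(family, text)
    return placed, x


# Usage with a fake fixed-width measure function (10 units per character).
runs = [("Latin", "Kiva Graphics"), ("Han", "一番"), ("Emoji", "😎")]
placed, end_x = place_runs(runs, "Helvetica", lambda family, text: 10.0 * len(text))
for item in placed:
    print(item)
```

Measuring each run before drawing is the workaround mentioned in the task list. If the low level drawing functions return the cursor position directly, `measure` goes away and the loop just carries the returned position forward.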
Lovely: https://raphlinus.github.io/rust/skribo/text/2019/04/04/font-fallback.html
Copying from #767 so it's easier to find:
Having played with [mapping of "Han" to a CJK language] a bit more, we should only use [the locale-based guess] when it's not otherwise clear from the context. For instance if a string already contains Hiragana or Katakana, then Han should be mapped to "Japanese". If Hangul is encountered, Han maps to "Korean". Only if the Han is mixed with some non-CJK language should we fall back to this locale-based guess.
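A small sketch of that rule, assuming we already have the set of scripts found in the string. The function name `resolve_han_language` and the `locale_guess` parameter are hypothetical. Two details are my assumptions rather than something decided above: kana is checked before Hangul when both are present, and a string containing only Han also falls through to the locale-based guess.

```python
def resolve_han_language(scripts, locale_guess):
    """Decide which CJK language Han characters should be treated as.

    `scripts` is an iterable of script names found in the string.
    `locale_guess` is the language to use when context does not decide.
    """
    found = set(scripts)
    if found & {"Hiragana", "Katakana"}:
        return "Japanese"
    if "Hangul" in found:
        return "Korean"
    # No kana or Hangul in the string: context is not enough.
    return locale_guess


print(resolve_han_language({"Han", "Hiragana"}, "Chinese"))  # → Japanese
print(resolve_han_language({"Han", "Hangul"}, "Chinese"))    # → Korean
print(resolve_han_language({"Han", "Latin"}, "Chinese"))     # → Chinese
```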
Consider libgrapheme or utf8proc for classifying graphemes in a string.