Text shaping and font fallback

Open hecrj opened this issue 4 years ago • 29 comments

wgpu_glyph uses glyph-brush, which in turn uses rusttype. While the current implementation is able to lay out text quite nicely, it does not perform any text shaping.

Text shaping with font fallback is a necessary feature for any serious GUI toolkit. It unlocks support to truly localize your application, supporting many different scripts.

The only available library that does a great job at shaping is HarfBuzz, which is implemented in C++. skribo seems to be a nice HarfBuzz wrapper for Rust.

This feature will probably imply rewriting wgpu_glyph entirely, as caching will be more complicated and the API will probably need to ask for more data.

hecrj avatar Oct 23 '19 18:10 hecrj

After the report in #48, I did a bit of research on script-aware font fallback in font-kit and skribo. It looks like it's an unsolved problem for the time being. Here are some interesting links that may be useful in the future:

hecrj avatar Nov 08 '19 20:11 hecrj

After the report in #48, I did a bit of research on script-aware font fallback in font-kit and skribo. It looks like it's an unsolved problem for the time being. Here are some interesting links that may be useful in the future:

* [servo/font-kit#37](https://github.com/servo/font-kit/issues/37)

* [Script matching in `skribo`](https://github.com/linebender/skribo/blob/master/docs/script_matching.md)

* [linebender/skribo#22](https://github.com/linebender/skribo/issues/22)

Recently, it seems that a new player is coming.

allsorts@github

blog

piaoger avatar Nov 22 '19 02:11 piaoger

Something else to follow: a native Rust harfbuzz port, which will unlock vector text in WASM. Both by @RazrFalcon.

dabreegster avatar Apr 03 '20 00:04 dabreegster

Hi! I hope to finish rustybuzz in 2-3 months, which should help a lot. But it doesn't include font fallback. I also need a proper font fallback in resvg, but I have no idea how it should be implemented. Afaik, there are no rules for this. Everyone is doing whatever they want. But it would be great to share some code instead of using an internal implementation.

I also plan to write a font matching library, similar to what fontkit does, but without system libraries.

RazrFalcon avatar Apr 03 '20 19:04 RazrFalcon

Hey @RazrFalcon, really looking forward to rustybuzz.

Afaik, there are no rules for this. Everyone is doing whatever they want. But it would be great to share some code instead of using an internal implementation.

My knowledge when it comes to this is very limited. I shared a couple of links in a previous comment, but I imagine you are probably aware of them already.

I agree having a shared solution would be great. Is the challenge here related to creating a solution that is decoupled from the font parsing/matching library?

I also plan to write a font matching library, similar to what fontkit does, but without system libraries.

Sounds great! Iced would benefit from this too. As of now, font-kit is pulling a bunch of external dependencies—like harfbuzz—even though we do not really shape any text.

Let me know if you have any ideas about how we could help you in your efforts.

hecrj avatar Apr 03 '20 19:04 hecrj

I guess the best one I've seen is: https://raphlinus.github.io/rust/skribo/text/2019/04/04/font-fallback.html But it has more questions than answers.

Also, sort of related: https://lord.io/blog/2019/text-editing-hates-you-too/

Is the challenge here related to creating a solution that is decoupled from the font parsing/matching library?

The biggest problem is the algorithm itself. I'm not an expert either, but afaiu the algorithm involves:

  1. Shaping with a master font.
  2. Checking what glyphs are missing.
  3. Finding a font (or fonts) with those glyphs.
  4. Shaping again.
  5. Now, we have to somehow merge two (or more) shaping results.
     resvg simply replaces glyphs, which is totally incorrect.
     I guess it should be done at the cluster level or per-word.
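
For readers following along, the steps above can be sketched as a toy model in Rust. Everything here is an assumption made for illustration: `ToyFont`, `ShapedGlyph`, `shape` and `shape_with_fallback` are hypothetical names, and the "shaper" emits exactly one glyph per character with no ligatures or reordering, so a glyph and a cluster coincide. It is not how resvg, HarfBuzz, or rustybuzz work; it only shows the shape, detect, re-shape, merge loop.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a loaded font: a name plus the set of
/// characters it has glyphs for. A real font would be parsed from a file.
struct ToyFont {
    name: &'static str,
    coverage: HashSet<char>,
}

/// One shaped glyph. `cluster` is the byte offset of the source character;
/// `font` is None when the glyph is missing (what a real shaper reports as .notdef).
#[derive(Debug, Clone, PartialEq)]
struct ShapedGlyph {
    cluster: usize,
    ch: char,
    font: Option<&'static str>,
}

/// Toy "shaper": one glyph per char. A real shaper does far more than this.
fn shape(text: &str, font: &ToyFont) -> Vec<ShapedGlyph> {
    text.char_indices()
        .map(|(cluster, ch)| ShapedGlyph {
            cluster,
            ch,
            font: if font.coverage.contains(&ch) { Some(font.name) } else { None },
        })
        .collect()
}

/// Shape with the main font, then re-shape with each fallback in order and
/// merge per cluster, keeping the first font that covers each cluster.
fn shape_with_fallback(text: &str, main: &ToyFont, fallbacks: &[ToyFont]) -> Vec<ShapedGlyph> {
    let mut result = shape(text, main); // step 1
    for fallback in fallbacks {
        // Step 2: stop as soon as nothing is missing.
        if result.iter().all(|g| g.font.is_some()) {
            break;
        }
        let reshaped = shape(text, fallback); // steps 3 and 4
        // Step 5: merge. Zipping is only valid because the toy shaper yields
        // the same clusters for every font; real runs would need cluster matching.
        for (slot, candidate) in result.iter_mut().zip(reshaped) {
            if slot.font.is_none() && candidate.font.is_some() {
                debug_assert_eq!(slot.cluster, candidate.cluster);
                *slot = candidate;
            }
        }
    }
    result
}
```

Clusters that no font covers keep `font: None`, which is where a renderer would draw a missing-glyph box.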

Also, resvg shapes a whole text chunk (in terms of SVG), but maybe a per-word shaping will be better.

Right now I'm focused only on shaping and font matching, so a font fallback can be implemented on top of them.

Let me know if you have any ideas about how we could help you in your efforts.

Free testers are the best resource :smile:

RazrFalcon avatar Apr 03 '20 20:04 RazrFalcon

As far as I understand there are two main approaches to font fallback:

  • Shaper-driven fallback that @RazrFalcon described above
  • Using the cmap of fonts to find one that supports all code points in a given grapheme cluster, like in https://drafts.csswg.org/css-fonts/#font-matching-algorithm

Skribo aims to eventually provide one of those (not just wrap Harfbuzz) but as you found it’s not there yet.
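
A minimal sketch of the second, cmap-based approach, loosely in the spirit of the CSS cluster matching rule linked above. The assumptions are loud ones: `CmapFont`, `pick_font` and `assign_fonts` are hypothetical names, a font's cmap is modelled as a plain set of `char`s, and the clusters are passed in already segmented because the standard library has no grapheme segmentation.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a font's cmap: the set of code points it maps.
struct CmapFont {
    name: &'static str,
    cmap: HashSet<char>,
}

/// Returns the first font, in priority order, whose cmap covers every
/// code point of the cluster, or None if no font does.
fn pick_font<'a>(cluster: &str, fonts: &'a [CmapFont]) -> Option<&'a CmapFont> {
    fonts
        .iter()
        .find(|f| cluster.chars().all(|c| f.cmap.contains(&c)))
}

/// Assigns a font name to each pre-segmented cluster.
fn assign_fonts(clusters: &[&str], fonts: &[CmapFont]) -> Vec<Option<&'static str>> {
    clusters
        .iter()
        .map(|c| pick_font(c, fonts).map(|f| f.name))
        .collect()
}
```

The point of matching whole clusters is that a base letter and its combining mark go to the same font, even when an earlier font covers the base letter alone.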

SimonSapin avatar Apr 04 '20 10:04 SimonSapin

@SimonSapin I'm not sure it's possible (or at least robust). HarfBuzz uses font-based, script-based text normalization, which checks not only cmap, but other tables too (mainly GDEF and GSUB).

RazrFalcon avatar Apr 04 '20 16:04 RazrFalcon

I noticed, this is on Druid's todo list as well. https://github.com/linebender/druid/issues/883

@RazrFalcon How "ready" is rustybuzz these days?

cryptoquick avatar Dec 03 '20 12:12 cryptoquick

@cryptoquick I hope to publish a pure Rust version this weekend. It's like 98% finished.

RazrFalcon avatar Dec 03 '20 12:12 RazrFalcon

Fantastic news! We'll keep an eye out. If you can, maybe also give us an update here, too, once the release is out.

cryptoquick avatar Dec 03 '20 12:12 cryptoquick

@cryptoquick It was finally published!

RazrFalcon avatar Dec 05 '20 12:12 RazrFalcon

@RazrFalcon Glad to hear! And true to your word! Thanks for the update. Your hard work will unblock many things across the Rust ecosystem. I'm very curious about its performance (speed, size, memory usage), so I've made an issue for that on rustybuzz.

cryptoquick avatar Dec 05 '20 14:12 cryptoquick

Any updates on this? Is this issue blocked by something else or just waiting for someone to add a dependency to harfbuzz or rustybuzz? Can you share a high-level overview of the steps needed? Might be interested in contributing but this is kinda out of my experience level given my mostly web-based background.

zumoshi avatar Jan 04 '21 13:01 zumoshi

@zumoshi if you want my 2p, this is not a small job. See why I created kas-text (supports shaping but not font fallbacks).

RazrFalcon's work doesn't actually help much here (except to avoid the need to link a C library) since harfbuzz-rs was already available for shaping.

dhardy avatar Jan 04 '21 15:01 dhardy

@dhardy I've been trying out different GUI crates, and I really liked the first-time experience of using iced. vgtk defaults to the GNOME theme, needs msys2 and GTK to be installed, and needs 30+ dll files to run without forcing the end user to have msys2, and it still complains about a missing MIME database. Iced, by contrast, ran out of the box on the first try with only rustup on Windows; it looks great, is fast, and the syntax in the examples feels familiar from the Vue/Flexbox that I already know. Unfortunately, RTL text is a blocking issue for me.

I appreciate that it is not easy, but if there is something I could do to help this along, it's worth some effort, especially if it helps me learn a thing or two along the way. I've read your article, and it seems you've put quite a lot of effort into kas-text, intending to make it compatible with wgpu_glyph. Thank you for your time. Feel free to contact me via email if you think the effort of guiding me on what to do is worth it to get some code out of me.

Can kas-text be integrated into iced, or are there some pieces missing? Even without font fallbacks, bidi rendering by itself would be way better than what we have atm.

zumoshi avatar Jan 04 '21 16:01 zumoshi

needs 30+ dll files […] RTL text is a blocking issue for me.

These two things are not unrelated. One or more of the DLL files are for Pango, which does a lot of work for text layout on top of HarfBuzz’s shaping.

SimonSapin avatar Jan 04 '21 17:01 SimonSapin

@zumoshi kas-text is designed for integration with KAS. It could absolutely be used with Iced, but some parts of the API may not be ideal due to differences between the GUIs (stateful widgets vs not).

Yes, Rustybuzz helps avoid another dependency. I didn't integrate into kas-text yet (see here). Edit: done now, allowing zero non-Rust dependencies.

dhardy avatar Jan 04 '21 17:01 dhardy

I think this project should consider https://github.com/dfrg/swash. I have integrated it into my Neovide project with great success.

Kethku avatar Jul 07 '21 19:07 Kethku

@dhardy

@zumoshi kas-text is designed for integration with KAS. It could absolutely be used with Iced, but some parts of the API may not be ideal due to differences between the GUIs (stateful widgets vs not).

Hi, I am currently writing a chat client, so font fallbacks are a must for me to support emoji. Can you give any hints on how to use kas-text with iced, or is that also a non-trivial job (i.e. one that needs changes to iced)?

MTRNord avatar Jul 19 '21 22:07 MTRNord

@MTRNord that would be quite a significant change to Iced. Also, KAS-text doesn't support emojis or embedded images yet.

dhardy avatar Jul 20 '21 05:07 dhardy

This issue was opened in 2019 and is still open. Is anyone still concerned with this issue? I found a solution for font fallback in neovide, if that would be helpful.

lj94093 avatar May 01 '22 15:05 lj94093

Funny thing is: Iced version 0.3 supported Unicode characters. At least I'm able to set button's text to Cyrillic.

And version 0.4 used to support Unicode. I found this commit (https://github.com/iced-rs/iced/tree/15a13a76b4b0534d08afc0328b90267048e41b9d) is allowing me to use Cyrillic and Arabic characters (but not Chinese)... What happened? Why does Iced still use Strings (that support UTF-8 out of the box), but not Unicode?

unreal79 avatar May 12 '22 04:05 unreal79

Ok, I investigated further. Looks like the last commit that supports Unicode (Cyrillic) is https://github.com/iced-rs/iced/tree/1e3feee3a36f25d7e2eda231c3e6b895858952c5

Next commit https://github.com/iced-rs/iced/tree/825c7749ff364cf1f7ae5cab0c25f27ed85c7d82 breaks Unicode support.

unreal79 avatar May 12 '22 09:05 unreal79

And... simply adding features = ["default_system_font"] to my Cargo.toml file fixed Unicode support for me...

unreal79 avatar May 12 '22 09:05 unreal79

Funny thing is: Iced version 0.3 supported Unicode characters. At least I'm able to set button's text to Cyrillic.

And version 0.4 used to support Unicode. I found this commit (https://github.com/iced-rs/iced/tree/15a13a76b4b0534d08afc0328b90267048e41b9d) is allowing me to use Cyrillic and Arabic characters (but not Chinese)... What happened? Why does Iced still use Strings (that support UTF-8 out of the box), but not Unicode?

UTF-8 is a variable-length encoding of Unicode, not a separate thing. A Unicode character (a Rust char, which is a 32-bit value) is encoded as a sequence of one to four u8 bytes, depending on the code point. For example, English letters fit into a single u8, while most Japanese characters need [u8; 3]. This reduces the storage space needed for the symbols that are used most heavily.
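
To make the byte counts concrete, here is a small check that uses only the Rust standard library (`char::len_utf8`); the helper name `utf8_lengths` is just for illustration.

```rust
/// Pairs each char of `text` with the number of bytes UTF-8 needs for it.
fn utf8_lengths(text: &str) -> Vec<(char, usize)> {
    text.chars().map(|c| (c, c.len_utf8())).collect()
}

/// Size in bytes of a Rust `char` in memory (always 4), regardless of
/// how many bytes the same character takes inside a UTF-8 `String`.
fn char_size_in_memory() -> usize {
    std::mem::size_of::<char>()
}
```

So a `String` holds any Unicode text; what varies is how many bytes each character occupies, and whether the active font has a glyph for it.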

Also, yes, for some reason he stopped enabling the default_system_font feature by default.

genusistimelord avatar May 12 '22 11:05 genusistimelord

And... simply adding features = ["default_system_font"] to my Cargo.toml file fixed Unicode support for me...

It sounds like the problem in your case was not Unicode support, but the lack of a font that contains the relevant glyphs. In other words, this was tofu, not mojibake.
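
The difference can be illustrated with a rough sketch. Both helpers are hypothetical: `misdecode_as_latin1` simulates mojibake by reading UTF-8 bytes as if each byte were one Latin-1 character, and `missing_glyphs` simulates tofu by asking a caller-supplied coverage check (an assumption standing in for a real font lookup) which characters have no glyph.

```rust
/// Mojibake: the bytes are fine but are decoded with the wrong encoding.
/// Here every UTF-8 byte is misread as a single Latin-1 character.
fn misdecode_as_latin1(text: &str) -> String {
    text.bytes().map(|b| b as char).collect()
}

/// Tofu: the text is decoded correctly, but the font lacks some glyphs.
/// `font_covers` is a hypothetical stand-in for a real font coverage query.
fn missing_glyphs(text: &str, font_covers: impl Fn(char) -> bool) -> Vec<char> {
    text.chars().filter(|&c| !font_covers(c)).collect()
}
```

With mojibake the string itself is wrong (six Cyrillic letters turn into twelve unrelated characters); with tofu the string is intact and only the rendering falls back to empty boxes.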

SimonSapin avatar May 12 '22 14:05 SimonSapin

Thanks for the information about tofu/mojibake. In my defense I may add that my OS (Win 10) is fully Cyrillic and I don't expect to see tofu or mojibake characters anywhere.

My concern is the disabled default_system_font. How can an Iced newbie know that a non-standard font/glyph set is being used by default?! It isn't mentioned in the documentation. AFAIK adding the feature in Cargo.toml is the only way to actually use the system font (setting default_font: None in the application Settings didn't help, contrary to what the docs say). I suspect many potential users abandoned Iced only because they couldn't use non-Latin characters.

unreal79 avatar May 13 '22 07:05 unreal79

@unreal79 PRs welcome :wink:

hecrj avatar May 13 '22 16:05 hecrj

Is there any alternative way to display coloured emoji? The only thing that comes to mind is to have images and display them as icons, but that's not feasible in my case (I need to display all the country flags, so keeping 200+ images around my code would not be a solution).

GyulyVGC avatar Dec 01 '22 23:12 GyulyVGC