Chinese support: is there any plan or way to support Chinese?
I'm using the latest version of Ladybird. English displays normally, but Chinese does not. Can someone tell me how to get Chinese working? Thanks.
Hi!
As far as I know, we don't have any programmers actively working on the project who can read or speak any dialect of Chinese. So, we don't regularly test or dogfood any pages using it.
If you can identify which fonts are used by those pages, and which features of OpenType or other font formats we are missing, it would help a lot. It's also possible we are missing features in layout or painting needed to correctly measure the size of each glyph, or to properly draw combined glyphs given a sequence of Unicode code points.
Support for loading an additional font for a CJK region of Unicode is also probably not very well tested, nor are text encodings other than UTF-8.
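To illustrate the encoding point: many older Chinese pages are served as GBK or Big5 rather than UTF-8. The following is a minimal Python sketch (not Ladybird code; the function name is made up for illustration) of what happens when GBK bytes are decoded as UTF-8.

```python
def decode_page(raw_bytes, encoding):
    """Decode page bytes, replacing undecodable bytes with U+FFFD."""
    return raw_bytes.decode(encoding, errors="replace")


text = "中文"
gbk_bytes = text.encode("gbk")      # legacy encoding, 2 bytes per character here
utf8_bytes = text.encode("utf-8")   # 3 bytes per character here

# Decoding with the correct encoding round-trips.
correct = decode_page(gbk_bytes, "gbk")

# Decoding GBK bytes as UTF-8 produces replacement characters,
# which is one way a page can end up looking garbled.
garbled = decode_page(gbk_bytes, "utf-8")
```

If a page shows replacement characters rather than empty boxes, the decoder is a more likely suspect than the font.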
From the CSS, I can see that the page's font is Arial. My plan is to copy the font from Windows to Linux (I use Ladybird on Ubuntu), but what then? Could you please tell me the steps needed to support Chinese? I think the project is interesting. I looked at the code in OpenType/Font.cpp, but still have no idea what to do.
@ADKaster
@doodoocoder We are in the process of removing our custom font rendering code and switching to open source font/graphics libraries. Once we are using third-party font drawing code, it should be easier to figure out where our bugs related to CJK Unicode ranges are.
I suspect that the problems are throughout LibWeb's Layout and Painting modules, where we assume Latin/Western text everywhere.
I expect that we will have the third-party code in the codebase within the next month or so.
The best way to debug a problem like this is to create a small web page with maybe 10 lines of CSS and 5 elements total, to see what we are doing properly and what we are doing improperly. There are probably so many missing features that it will be easy to find one that is fixable in a small PR.
I wrote a very small test case at https://www.eagleflow.fi/天道虫 (tentoumushi, ladybird).
It looks like the number of glyphs is correct, and Qt parts like titlebar and tab name render correctly, but the font selection logic doesn't pick a font that has those glyphs included (despite one being available on the system and explicitly requested via font-family).
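The missing piece described here is per-code-point font fallback. A minimal Python sketch of the idea follows, assuming a hypothetical `Font` type that only knows which code points it covers; it is not how LibWeb is actually structured, and the font names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    """Hypothetical stand-in for a loaded font: a name plus covered code points."""
    name: str
    coverage: frozenset

    def has_glyph(self, code_point):
        return code_point in self.coverage


def pick_font(code_point, font_list):
    """Return the first font in the cascade that has a glyph, or None."""
    for font in font_list:
        if font.has_glyph(code_point):
            return font
    return None


def split_into_runs(text, font_list):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for ch in text:
        font = pick_font(ord(ch), font_list)
        name = font.name if font else None
        if runs and runs[-1][0] == name:
            runs[-1] = (name, runs[-1][1] + ch)
        else:
            runs.append((name, ch))
    return runs


# Placeholder fonts: printable ASCII, and the CJK Unified Ideographs block.
latin = Font("Latin Sans", frozenset(range(0x20, 0x7F)))
cjk = Font("CJK Sans", frozenset(range(0x4E00, 0xA000)))
runs = split_into_runs("中文 zhongwen", [latin, cjk])
```

Each run can then be shaped and painted with its own font, so a Latin-first font list still renders the Han characters.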
Since fontconfig is already being used, it could be relatively simple to query for a font with CJK capabilities. The exact font choices would then depend on the user's system configuration.
~ % fc-match sans
NotoSans-Regular.ttf: "Noto Sans" "Regular"
~ % fc-match sans:lang=ja
NotoSansCJK.otf: "Noto Sans CJK JP" "Regular"
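The same query can be scripted for experimentation. This is only a sketch that shells out to the `fc-match` command shown above; the helper names are made up, and a real implementation would presumably use the fontconfig library API directly instead.

```python
import shutil
import subprocess


def build_pattern(family, lang=None):
    """Build a fontconfig pattern such as 'sans:lang=ja'."""
    return f"{family}:lang={lang}" if lang else family


def fc_match(family, lang=None):
    """Ask fontconfig for the best match; None if fc-match is unavailable."""
    if shutil.which("fc-match") is None:
        return None
    result = subprocess.run(
        ["fc-match", build_pattern(family, lang)],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    for lang in (None, "ja", "zh-cn", "ko"):
        print(build_pattern("sans", lang), "->", fc_match("sans", lang))
```

The output depends entirely on which fonts are installed, which is the point made above about the user's system configuration.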
Unfortunately there is also Han unification to deal with when rendering CJK glyphs. The same code point can look subtly different depending on selected font and language, and the heuristics of which exact font variant to pick can get quite complex. (Unfortunately the HTML lang attribute is seldom used in real life.)
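To give a flavor of such heuristics, here is a deliberately simplistic Python sketch. It is an assumption about one possible approach, not what any browser actually does: trust a declared language if there is one, otherwise treat kana as a sign of Japanese and Hangul as a sign of Korean, and give up on Han-only text.

```python
def guess_cjk_language(text, declared_lang=None):
    """Guess a language tag for choosing a CJK font variant.

    Returns the declared language if given, "ja" or "ko" when the script
    makes it obvious, and None when the text is ambiguous or not CJK.
    """
    if declared_lang:
        return declared_lang
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:   # Hiragana and Katakana blocks
            return "ja"
        if 0xAC00 <= cp <= 0xD7AF:   # Hangul Syllables block
            return "ko"
    # Han-only text could be Chinese or Japanese; a caller might fall back
    # to the user's locale or preferred languages here.
    return None
```

Han-only strings such as 天道虫 stay ambiguous under this sketch, which is exactly where the hard part of Han unification begins.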
Unless I'm mistaken, this font selection logic needs to exist even if Ladybird moves to Skia & HarfBuzz for font rendering & shaping in the future.
I would be interested in contributing to Chinese support.
Here is the simplest example:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Hello World</title>
</head>
<body>
中文 汉字 zhongwen hanzi
</body>
</html>
To preview this example, first install a CJK font. For example, on Arch Linux you can install noto-fonts-cjk.
Firefox Preview (Windows 10)
Support (Check list)
After in-depth research on this issue, solving it may require work in the following areas:
- Style computation
  - Compute the font based on the code points of each Chunk
- OpenType/CMAP lookup
  - Do CJK fonts use a special CMAP format?
- Glyph rasterization
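On the CMAP question: as far as I know there is no CJK-only format, but CJK fonts tend to rely on subtable formats beyond the BMP-only format 4, typically format 12 (full Unicode range, needed for supplementary-plane ideographs) and format 14 (variation sequences). A small Python sketch for listing the subtable formats in a raw `cmap` table follows. The demo table is synthetic and contains only the format fields, so it is not a usable cmap.

```python
import struct


def list_cmap_subtables(cmap):
    """Return (platform_id, encoding_id, format) for each encoding record."""
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    records = []
    for i in range(num_tables):
        # Each encoding record is 8 bytes: platformID, encodingID, offset.
        platform_id, encoding_id, offset = struct.unpack_from(
            ">HHI", cmap, 4 + 8 * i
        )
        # Every subtable starts with a uint16 format field.
        (fmt,) = struct.unpack_from(">H", cmap, offset)
        records.append((platform_id, encoding_id, fmt))
    return records


def make_demo_cmap():
    """Synthetic table: Windows BMP (format 4) and Windows full Unicode (format 12)."""
    header = struct.pack(">HH", 0, 2)
    records = struct.pack(">HHI", 3, 1, 20) + struct.pack(">HHI", 3, 10, 22)
    subtables = struct.pack(">H", 4) + struct.pack(">H", 12)
    return header + records + subtables


demo_subtables = list_cmap_subtables(make_demo_cmap())
```

To run this against a real font, the `cmap` table would first have to be located through the font's table directory, which this sketch leaves out.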
@eagleflo your test page should also have Arabic and Hebrew.
@ADKaster After my testing (on macOS), the only reason Ladybird cannot display CJK text properly is that it does not pick up the system fonts correctly.
When Noto Sans is imported via CSS, the CJK text on the web page displays normally.
However, possibly because of the same-origin policy, the remote web font does not seem to work.
Therefore, I wrote a local test page, which displays all CJK text normally as long as the correct font files are in the same directory.
For comparison, this is what it looks like without importing fonts:
The source code of this test page is available in this Gist. To preview it properly, you need to download all the fonts I use in this demo through this Google Drive Link, or manually download and rename them from their GitHub Releases.
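For readers who only want the general idea, here is a rough Python sketch that generates a similar local test page. It is not the Gist's code: the font file name and family name are hypothetical, and the font file is assumed to sit next to the generated HTML.

```python
def build_test_page(text, font_file, family="TestCJK"):
    """Return an HTML page that loads a local font file via @font-face."""
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<style>
@font-face {{
  font-family: "{family}";
  src: url("{font_file}");
}}
body {{ font-family: "{family}", sans-serif; }}
</style>
</head>
<body>
{text}
</body>
</html>
"""


# Hypothetical file name; use whichever CJK font file you actually have.
page = build_test_page("中文 汉字 zhongwen hanzi", "NotoSansSC-Regular.otf")

if __name__ == "__main__":
    with open("cjk-test.html", "w", encoding="utf-8") as f:
        f.write(page)
```

Opening the generated file from the same directory as the font avoids fetching the font from another origin.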
In addition, in case you are worried that this demo is too simple and hides other problems: I also saved this more complex encyclopedia page locally as HTML.
Unfortunately, Wikipedia cannot be viewed properly after being saved as HTML.
After modifying its HTML to import fonts in the same way as the demo above, all the contents of the page can be browsed normally.
Therefore, to support CJK text quickly and directly, the Ladybird development team can focus on loading system fonts correctly. Once this problem is solved, most web pages containing CJK characters should display normally.
Finally, a side note: it's cool to make a browser from scratch that doesn't rely on an existing engine! As a web developer, I'd love to see a new browser that uses an independent engine. This will help break Google's monopoly on browser engines.
Just out of curiosity, does this work on non-macOS operating systems? (I don't have Linux/Windows to test.)