appstream-glib icon indicating copy to clipboard operation
appstream-glib copied to clipboard

Inconsistency results for font styles from appstream-builder

Open tagoh opened this issue 11 months ago • 3 comments

Currently appstream-builder seems picking up a first entry from query of the packages in the alphabetical order though, this affects a screenshot inconsistently.

$ rm -rf /tmp/logs /tmp/appstream /tmp/repodata || :; mkdir /tmp/logs; appstream-builder --verbose --max-threads=8 --log-dir=/tmp/logs --packages-dir=./google-noto-fonts/noarch --temp-dir=/tmp/appstream --output-dir=/tmp/repodata
<snip>
Done!
$ zgrep summary /tmp/repodata/example.xml.gz | wc -l
294
$ zgrep summary /tmp/repodata/example.xml.gz | grep Black | wc -l  
63
$ zgrep summary /tmp/repodata/example.xml.gz | grep Regular | wc -l
197
$ zgrep summary /tmp/repodata/example.xml.gz | grep Bold | wc -l
30
$ zgrep summary /tmp/repodata/example.xml.gz | grep Light | wc -l
1
$ zgrep summary /tmp/repodata/example.xml.gz | grep Italic | wc -l
4

Shouldn't be the first attempt "Regular"?

tagoh avatar Feb 20 '25 04:02 tagoh

Yes, that makes sense. I think we tried to do that here https://github.com/hughsie/appstream-glib/blob/main/libappstream-builder/plugins/asb-plugin-font.c#L467 but that might be broken. Can you investigate and see if it's still valid? Thanks.

hughsie avatar Feb 20 '25 08:02 hughsie

Unfortunately there seems some misunderstanding in the code.

  1. The most of fonts which follows OpenType specification has TT_PLATFORM_MICROSOFT as the platform_id and TT_MS_ID_UNICODE_CS as the encoding_id. Strings in this type of fonts is encoded as UTF-16BE. So you can't get a proper result from g_locale_to_utf8.
  2. Most of the non-Regular fonts (in the sense of that the weight is something other than 400 in the opentype spec) also has Regular in SubFamily as a pair of the family name like FamilyName SubFamily such as Noto Sans Light, while they have Light for Noto Sans.
  3. In OpenType fonts, the plain family name is more likely to be stored as TT_NAME_ID_WWS_FAMILY(21) and TT_NAME_ID_TYPOGRAPHIC_FAMILY(16) than TT_NAME_ID_FONT_FAMILY(1). In this case, TT_NAME_ID_FONT_FAMILY will be something like "Family SubFamily" style of the name and TT_NAME_ID_FONT_SUBFAMILY(2) will be "Regular". You need to use TT_NAME_ID_TYPOGRAPHIC_SUBFAMILY(17) or TT_NAME_ID_WWS_SUBFAMILY(22) as a pair of them.
$ FC_DEBUG=384 fc-query /usr/share/fonts/google-noto/NotoSerif-Black.ttf
FC_DEBUG=384

found family (n 16 p 3 e 1 l 0x0409)Noto Serif
found family (n  1 p 3 e 1 l 0x0409)Noto Serif Black
found full   (n  4 p 3 e 1 l 0x0409)Noto Serif Black
found style  (n 17 p 3 e 1 l 0x0409) Black
found style  (n  2 p 3 e 1 l 0x0409) Regular
...

It is complicated thing. I'd recommend to pick them up from fontconfig since Appstream already has a dependency to fontconfig. I can work on it if this idea is acceptable.

tagoh avatar Feb 25 '25 05:02 tagoh

Unfortunately there seems some misunderstanding in the code.

Hugely probably; my grasp on font internals is rudimentary at best.

I can work on it if this idea is acceptable.

Yes please!

hughsie avatar Feb 25 '25 07:02 hughsie