FreeTypeAbstraction.jl

findfont picks wrong font

Open tecosaur opened this issue 1 year ago • 5 comments

I was recently thinking of trying something other than a cache to speed up findfont (it takes ~1.5s on my machine), but in the process I noticed that it seems to be picking the wrong font:

julia> FreeTypeAbstraction.findfont("ibm plex sans bold italic")
FTFont (family = AlegreyaSans-BoldItalic, style = Bold)

julia> ibm_psbi = FreeTypeAbstraction.try_load("/home/tec/.local/share/fonts/IBMPlexSans-BoldItalic.otf")
FTFont (family = IBM Plex Sans, style = Bold Italic)

julia> FreeTypeAbstraction.match_font(ibm_psbi, ["ibm", "plex", "sans", "bold", "italic"])
(11, 10, false, -24)

julia> alegreya_tex_sbi = FreeTypeAbstraction.try_load("/usr/share/fonts/texlive-alegreya/AlegreyaSans-BoldItalic.pfb")
FTFont (family = AlegreyaSans-BoldItalic, style = Bold)

julia> FreeTypeAbstraction.match_font(alegreya_tex_sbi, ["ibm", "plex", "sans", "bold", "italic"])
(14, 0, false, -27)

tecosaur, Jan 29 '24 09:01

Seems like the function I wrote there has some bad edge cases. The problem here is that the family name of AlegreyaSans-BoldItalic contains "sans", "bold" and "italic", which scores higher (14) than matching "ibm", "plex" and "sans" in the IBM family name (11). The algorithm ranks fonts by the best family name match first, but that's probably not a good idea when I look at these names. A fix could be to add the first two numbers together to make a joint family plus style score. Then IBM would win with 21 vs. 14 points.

What I wanted at the time was that you wouldn't have to spell out "AlegreyaSans-BoldItalic" exactly, because these exact names are not obvious and are annoying to look up for every font. But this matching on name parts has its own issues, as seen here.

jkrumbiegel, Jan 29 '24 09:01

That's exactly the thought I had (combining the scores, and using the family score as the first tiebreak).
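As a rough sketch of that combined ordering, the snippet below reuses the two tuples printed by match_font above. It assumes the first two entries are the family and style scores (as described earlier in the thread) and keeps the remaining two as later tiebreaks; `combined` is a hypothetical helper, not part of FreeTypeAbstraction.jl.

```julia
# Score tuples copied from the match_font output earlier in this thread.
ibm      = (11, 10, false, -24)
alegreya = (14, 0, false, -27)

# Tuples compare element by element, so today the family score decides first.
current_winner = ibm > alegreya ? "IBM Plex Sans" : "AlegreyaSans-BoldItalic"

# Hypothetical helper: joint family + style score first,
# family score as the first tiebreak, then the remaining entries unchanged.
combined(score) = (score[1] + score[2], score[1], score[3], score[4])

proposed_winner = combined(ibm) > combined(alegreya) ? "IBM Plex Sans" : "AlegreyaSans-BoldItalic"

println("current:  ", current_winner)
println("proposed: ", proposed_winner)
```

With the current ordering the Alegreya tuple compares greater; with the joint score the IBM tuple wins, 21 to 14.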

The non-cache speed-up I had in mind relies on the following:

  • Most of the time, I think we can expect an exact match (i.e. every part of the "search term list" is matched to the font)
  • When an exact match occurs, it's probably fine to bail out at that point
  • We can sort the list of font files by how closely their filename matches the search term to try to check likely-exact matches first
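The pre-sorting step in the list above could look roughly like this. The file names are made up for illustration, and `filename_hits` is a hypothetical ranking key (number of search parts found in the lowercased file name), chosen only because it is cheap to compute without opening the font.

```julia
# Hypothetical file names; a real run would list the system font directories.
fontfiles = [
    "AlegreyaSans-BoldItalic.pfb",
    "IBMPlexSans-Regular.otf",
    "IBMPlexSans-BoldItalic.otf",
]
searchparts = ["ibm", "plex", "sans", "bold", "italic"]

# Hypothetical ranking key: how many search parts occur in the lowercased file name.
filename_hits(file) = count(part -> occursin(part, lowercase(file)), searchparts)

# Most promising files first, so a likely exact match gets loaded early.
ordered = sort(fontfiles; by = filename_hits, rev = true)

# Files whose name contains every search part are the likely exact matches.
likely_exact = filter(file -> filename_hits(file) == length(searchparts), ordered)

println(first(ordered))
```

This only reorders the candidates; each font still has to be loaded and scored to confirm the match.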

Might you have any thoughts on this?

tecosaur, Jan 29 '24 09:01

> When an exact match occurs, it's probably fine to bail out at that point

That's why there's a name length penalty. It ensures that "some font bold" is preferred over "some font bold italic" when searching for "some font bold", even though both are an "exact match".
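To illustrate the tiebreak, here is a small sketch with made-up font names. `matched` and `score` are hypothetical helpers that only imitate the idea of counting matched characters and adding a negative name length; they are not the package's implementation.

```julia
# Hypothetical font names; neither is taken from a real font file.
candidates = ["Some Font Bold Italic", "Some Font Bold"]
searchparts = ["some", "font", "bold"]

# Matched characters: every search part found in the name adds its length.
matched(name) = sum(length, filter(p -> occursin(p, lowercase(name)), searchparts); init = 0)

# The negative name length is the penalty: on equal matches, the shorter name ranks higher.
score(name) = (matched(name), -length(name))

best = first(sort(candidates; by = score, rev = true))
println(best)
```

Both names match all three search parts, so only the length penalty separates them.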

jkrumbiegel, Jan 29 '24 09:01

I think the idea of pre-sorting + bailing out should be fine as long as we can assume that font files (even just within the same family's set of files) have a semi-consistent naming scheme.

What I'd like to test, then, is sorting by matching components and then by length, and not bailing out until the total match score decreases. I think that should behave pretty well in practice and give a dramatic speed-up (Makie's caching is helpful, but a multi-second delay the first time you hit a custom font still seems a bit much).
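A minimal sketch of that scan, under loud assumptions: the file names are invented, `total_score` is a hypothetical stand-in for loading a font and scoring it (here it just counts matched characters in the file name), and `scan` is not an existing function in the package.

```julia
searchparts = ["ibm", "plex", "sans", "bold", "italic"]

# Hypothetical stand-in for the expensive step of loading and scoring a font.
total_score(file) = sum(length, filter(p -> occursin(p, lowercase(file)), searchparts); init = 0)

# Sort key: more matching components first, then shorter file names.
sortkey(file) = (-count(p -> occursin(p, lowercase(file)), searchparts), length(file))

function scan(files)
    best, best_score, checked = nothing, -1, 0
    for file in sort(files; by = sortkey)
        score = total_score(file)
        checked += 1                   # counts every file that had to be scored
        score < best_score && break    # total score decreased: stop scanning
        if score > best_score
            best, best_score = file, score
        end
    end
    return best, checked
end

# Hypothetical candidate list.
files = [
    "AlegreyaSans-BoldItalic.pfb",
    "IBMPlexSans-Bold.otf",
    "IBMPlexSans-BoldItalic.otf",
    "DejaVuSerif.ttf",
]
best, checked = scan(files)
println(best, " after scoring ", checked, " of ", length(files), " files")
```

The scan stops at the first file whose score drops below the best seen so far, which is only safe if file names follow a reasonably consistent scheme, as noted above.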

tecosaur, Jan 29 '24 10:01

Hmmm, looking at the current scheme, I think I see one or two other aspects where the matching process falls short. I'll see how my efforts go; I might have a PR up for discussion shortly.

tecosaur, Jan 29 '24 10:01