FreeTypeAbstraction.jl
Better font-finding heuristics, with a shortcut and a few caches.
I initially started looking at this package when I found that it doesn't always look in the right directories for fonts (see #82). However, I recently found that the heuristic for font finding itself is both slow (#67) and a bit dodgy (#83).
So, I've gone through the font-finding code and made a few changes:
- The scoring heuristic takes a few more factors into account
  - Earlier components of a search string are now weighted slightly higher (e.g. the `plex` in `ibm plex sans italic` is worth more than `sans`)
  - Particular regular styles will be picked over others (e.g. `regular` over `medium`)
  - Certain font formats are prioritised (e.g. `otf` over `pfb`)
- ~~Introduced a considered shortcut when scoring each font file~~
  - ~~The list of matching font files is pre-sorted according to matches of the search string in the font file name~~
  - ~~We calculate the maximum possible family and style score given the search string~~
  - ~~We return the current best font early when:~~
    - ~~We have found a font that achieves the maximum score~~
    - ~~We have seen more than twice as many fonts as the last font with the maximum score seen~~
- Introduced a few (runtime) caches
  - A cache of the font file names, without the extension, as lowercase
  - A cache of the family, style, and extension of font files
  - A cache of the resolved font for given searches (invalidated by directory modification times)
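
To make the scoring factors above concrete, here is a minimal sketch of a position-weighted score with style and format tie-breakers. It is not the code from this PR: the function names (`family_score`, `font_score`), the weights, and the two preference tables are all assumptions chosen for illustration.

```julia
# Hypothetical preference tables; only the orderings (regular > medium,
# otf > pfb) come from the description above, the numbers are made up.
const STYLE_PREFERENCE = Dict("regular" => 2, "medium" => 1)
const EXT_PREFERENCE = Dict("otf" => 1, "pfb" => 0)

"""
Score `family` against the whitespace-separated `query`.
A matching component contributes its length, scaled so that
earlier components weigh slightly more than later ones.
"""
function family_score(query::AbstractString, family::AbstractString)
    parts = split(lowercase(query))
    fam = lowercase(family)
    n = length(parts)
    score = 0.0
    for (i, part) in enumerate(parts)
        if occursin(part, fam)
            # Component i of n gets a bonus of 10% per remaining position.
            score += length(part) * (1 + 0.1 * (n - i))
        end
    end
    return score
end

"""
Combine the family score with small tie-breakers for style and file
format, so that they only matter between otherwise equal candidates.
"""
function font_score(query, family, style, ext)
    score = family_score(query, family)
    score += 0.1 * get(STYLE_PREFERENCE, lowercase(style), 0)
    score += 0.1 * get(EXT_PREFERENCE, lowercase(ext), 0)
    return score
end
```

With these assumed weights, `plex` in `ibm plex sans italic` contributes more than `sans` even though both are four characters long, because it appears earlier in the search string.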
From the tests that I've performed locally, this batch of changes results in faster, better initial lookups, and faster (again) subsequent lookups.
Here are some test results on my machine:
Search string | Current time | PR time | Current result (family, style) | PR result |
---|---|---|---|---|
ibm plex sans bold italic | 1.5s | 0.06s | AlegreyaSans-BoldItalic, Bold | IBM Plex Sans, Bold Italic |
ibm sans | 1.5s | 0.04s | IBM Plex Sans, Regular | (same) |
sans | 1.5s | 0.02s | KpSans, Regular | DejaVu Sans, Book |
hack | 1.5s | 0.02s | Hack, Regular | (same) |
computer modern | 1.5s | 0.03s | Computer Modern, Roman | Computer Modern, Medium |
schoolbook | 1.5s | 0.03s | Century Schoolbook L, Roman | Adobe New Century Schoolbook, Medium |
medium euler | 1.5s | 0.08s | Alegreya-Medium, Medium | Euler, Medium |
bold slanted roman | 1.5s | 0.04s | Latin Modern Roman Slanted, 10 Bold | (same) |
Subsequent identical searches take ~0.0002s.
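
The fast repeat lookups come from the resolved-font cache. As a rough sketch of how such a cache could be invalidated by directory modification times (the names `CachedLookup`, `LOOKUP_CACHE`, and `cached_find` are hypothetical, and the result is a plain `String` here rather than a font object):

```julia
# One cache entry: the resolved result plus the directory mtimes seen
# when it was computed. All names here are illustrative assumptions.
struct CachedLookup
    result::String
    dir_mtimes::Dict{String,Float64}
end

const LOOKUP_CACHE = Dict{String,CachedLookup}()

# `mtime` returns 0.0 for paths that do not exist, so missing
# directories are handled without special-casing.
snapshot_mtimes(dirs) = Dict{String,Float64}(d => mtime(d) for d in dirs)

"""
Return the cached result for `query` if the font directories are
unchanged since it was stored; otherwise call `search_fn(query, dirs)`
and store the fresh result.
"""
function cached_find(search_fn, query::AbstractString, dirs::Vector{String})
    current = snapshot_mtimes(dirs)
    entry = get(LOOKUP_CACHE, query, nothing)
    if entry !== nothing && entry.dir_mtimes == current
        return entry.result
    end
    result = search_fn(query, dirs)
    LOOKUP_CACHE[query] = CachedLookup(result, current)
    return result
end
```

A repeated call with the same query and unchanged directories then costs only a handful of `stat` calls plus a dictionary lookup, instead of a full scan.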
If more people could perform adversarial (but not pathological) tests with this scheme, that would be much appreciated.
There was some concern raised about using the file-sorting based shortcut. I've just had a look at instead baking the fontfile sort-info into the precompilation step, and that seems to work a treat.
Codecov Report
Attention: 2 lines in your changes are missing coverage. Please review.
Comparison is base (077003e) 95.26% compared to head (3ce7edf) 95.53%.
Files | Patch % | Lines |
---|---|---|
src/findfonts.jl | 95.74% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #84 +/- ##
==========================================
+ Coverage 95.26% 95.53% +0.26%
==========================================
Files 6 6
Lines 317 336 +19
==========================================
+ Hits 302 321 +19
Misses 15 15
If we're caching paths at precompilation time, that will pose issues for relocatability, I think. Maybe this should not be done at that point but at init. It could be made async so that it doesn't delay package loading, but then some locking mechanism would need to be in place for usage.
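
A minimal sketch of what the async-at-init idea might look like, assuming a hypothetical index keyed by lowercase file stem. None of these names (`FONT_INDEX`, `start_indexing!`, `lookup_font_path`) exist in the package; in a real module, `start_indexing!` would be called from `__init__`.

```julia
# Hypothetical global index: lowercase file stem => full path.
const FONT_INDEX = Dict{String,String}()
const FONT_INDEX_LOCK = ReentrantLock()
const INDEX_TASK = Ref{Union{Task,Nothing}}(nothing)

# Walk the given directories and collect font files by stem.
function build_index(dirs)
    found = Dict{String,String}()
    for dir in dirs
        isdir(dir) || continue
        for (root, _, files) in walkdir(dir)
            for file in files
                stem, ext = splitext(file)
                lowercase(ext) in (".ttf", ".otf") || continue
                found[lowercase(stem)] = joinpath(root, file)
            end
        end
    end
    return found
end

# Start the scan in the background so that loading is not delayed.
function start_indexing!(dirs)
    INDEX_TASK[] = Threads.@spawn begin
        found = build_index(dirs)
        lock(FONT_INDEX_LOCK) do
            merge!(FONT_INDEX, found)
        end
    end
    return nothing
end

# Block until the background scan (if any) has finished, then read
# under the lock so a concurrent re-index cannot race with the lookup.
function lookup_font_path(name::AbstractString)
    task = INDEX_TASK[]
    task === nothing || wait(task)
    lock(FONT_INDEX_LOCK) do
        get(FONT_INDEX, lowercase(name), nothing)
    end
end
```

The first lookup pays whatever is left of the scan time; later lookups only take the lock.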
> If we're caching paths at precompilation time, that will pose issues for relocatability I think...
Relocatability does make this interesting, but I think we need to be clearer on what the potential "issues" might be. In this case, the primary consequence of a precompiled value that doesn't match the actual system is stale cache entries which will be replaced/ignored at runtime. I.e. this doesn't affect correctness, but might affect some performance considerations.
Do you know of any precompile result relocation happening in practice? It could help to have a more concrete example to discuss this in the context of.
Yes, the issue would be that the baked-in fonts in the dict probably do not transfer to another system. So you wouldn't gain much speed there.
> Do you know of any precompile result relocation happening in practice? It could help to have a more concrete example to discuss this in the context of.
At work, we use sysimages compiled on CI runners and transferred to other runners. There we run into relocatability issues all the time.
Right. So it sounds like what we really want is a per-system font cache file?
Would `Scratch.jl` be the way to go then?
@jkrumbiegel any further thoughts on this?
Not really further, I also think using a scratch space might be the reasonable way to go. Store a file with all the names and paths there and only repopulate the info when files change.
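
As a rough illustration of the per-system cache file idea, here is a sketch that stores the font index with the `Serialization` standard library and rescans only when a directory's modification time changes. The function names and the file name `font_index.jls` are assumptions. The cache directory is passed in explicitly so the sketch runs without extra packages; with Scratch.jl it would presumably come from something like `@get_scratch!("font_cache")`.

```julia
using Serialization

# Collect font files under `dirs`, keyed by lowercase file stem.
function scan_fonts(dirs)
    paths = Dict{String,String}()
    for dir in dirs
        isdir(dir) || continue
        for (root, _, files) in walkdir(dir)
            for file in files
                stem, ext = splitext(file)
                lowercase(ext) in (".ttf", ".otf") || continue
                paths[lowercase(stem)] = joinpath(root, file)
            end
        end
    end
    return paths
end

"""
Load the font index from `cache_dir`, rescanning `dirs` only if the
cache file is missing, unreadable, or was written for different
directory modification times. Note that a directory's mtime only
reflects entries added or removed directly inside it.
"""
function load_font_index(cache_dir::AbstractString, dirs)
    cache_file = joinpath(cache_dir, "font_index.jls")
    current = Dict{String,Float64}(d => mtime(d) for d in dirs)
    if isfile(cache_file)
        cached = try
            deserialize(cache_file)
        catch
            nothing  # treat a corrupt or incompatible cache as absent
        end
        if cached isa Tuple && length(cached) == 2 && cached[1] == current
            return cached[2]
        end
    end
    paths = scan_fonts(dirs)
    mkpath(cache_dir)
    serialize(cache_file, (current, paths))
    return paths
end
```

Because the file lives outside the precompile cache, a relocated sysimage would simply find no matching cache on the new machine and rebuild it once.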