FreeTypeAbstraction.jl

findfont scans all fonts every time

Open ederag opened this issue 3 years ago • 13 comments

findfont works well, but it takes several seconds here (openSUSE Leap 15.3). That may be considered fast given how much work it does, but it led to the significant system-dependent overhead found in https://github.com/cesaraustralia/DynamicGrids.jl/issues/194#issuecomment-995176775.

findfont always scans all fonts, opening each one to read its name and properties and pick the best match.

The issue is that over 10_000 fonts are installed on my system. Most are quick to load (median about 75 µs), but some take much longer (see the diagnosis histogram), so the average is about 270 µs.

diagnosis
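using FreeTypeAbstraction: fontpaths, try_load, match_font
font_folders = copy(fontpaths())

t1_ = Float64[]    # load time of each file, in seconds
path_ = String[]   # path of each file, in the same order as t1_
@time for folder in font_folders
    for font in readdir(folder)
        fpath = joinpath(folder, font)

        # time a single load attempt; try_load returns nothing when loading fails
        t1 = @elapsed face = try_load(fpath)
        push!(t1_, t1)
        push!(path_, fpath)
        face === nothing && continue
        finalize(face)   # release the face right away
    end
end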

  2.855721 seconds (159.18 k allocations: 23.692 MiB)

julia> sum(t1_)
2.832251035

julia> using UnicodePlots
julia> histogram(t1_)
                  ┌                                        ┐ 
   [0.0  , 0.001) ┤█████████████████████████████████  10063  
   [0.001, 0.002) ┤▌ 151                                     
   [0.002, 0.003) ┤▍ 83                                      
   [0.003, 0.004) ┤▎ 32                                      
   [0.004, 0.005) ┤▏ 4                                       
   [0.005, 0.006) ┤▏ 14                                      
   [0.006, 0.007) ┤▏ 7                                       
   [0.007, 0.008) ┤▏ 15                                      
   [0.008, 0.009) ┤▏ 9                                       
   [0.009, 0.01 ) ┤▏ 1                                       
   [0.01 , 0.011) ┤▏ 3                                       
   [0.011, 0.012) ┤  0                                       
   [0.012, 0.013) ┤▏ 7                                       
   [0.013, 0.014) ┤▏ 1                                       
   [0.014, 0.015) ┤▏ 1                                       
                  └                                        ┘ 
                                   Frequency                

julia> using Statistics
julia> mean(t1_)
0.00027256770618804736
julia> median(t1_)
7.4702e-5

There are several solutions on the user side (e.g. removing slow fonts, or using FTFont(font_path) to load a specific font directly).

But it might be nice to have some kind of cache, so that findfont(font_name) would be faster on subsequent calls?
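A minimal sketch of what such a cache could look like, assuming it memoizes the file path chosen for each name rather than the loaded face. `FONT_PATH_CACHE`, `cached_font_path` and the stand-in `slow_search` are hypothetical names, not part of FreeTypeAbstraction.jl, and the example path is made up; in practice `slow_search` would be the full directory scan.

```julia
# Hypothetical in-session cache: font name => chosen file path (or nothing).
const FONT_PATH_CACHE = Dict{String,Union{Nothing,String}}()

# Run `search(name)` only on the first request for `name`; later calls
# return the stored answer without touching the file system.
function cached_font_path(search, name::AbstractString)
    return get!(FONT_PATH_CACHE, String(name)) do
        search(name)
    end
end

# Stand-in for the expensive scan, counting how often it really runs.
const SCAN_COUNT = Ref(0)
function slow_search(name)
    SCAN_COUNT[] += 1
    # made-up path, for illustration only
    return name == "dejavu sans" ? "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf" : nothing
end

cached_font_path(slow_search, "dejavu sans")  # runs the scan
cached_font_path(slow_search, "dejavu sans")  # served from the cache
```

Note that this only helps within one Julia session, and a name that was not found is remembered as `nothing` until the cache is emptied.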

ederag avatar Dec 31 '21 16:12 ederag

Thanks for looking into this. DynamicGrids could allow passing in the FTFont directly, but a cache seems like a cleaner solution, and probably better than every package implementing its own.

Can we just move the cache in Makie here? How easy would it be to just copy the code over? @SimonDanisch @jkrumbiegel

rafaqz avatar Jan 04 '22 07:01 rafaqz

Agreed that centralizing would be better. I opened a discussion on discourse because it might involve Fontconfig.jl as well (it's very fast), while Fontconfig.jl and FreeTypeAbstraction.jl should probably not depend on one another.

ederag avatar Jan 04 '22 08:01 ederag

Sounds like a good idea to make findfont as fast as possible ^^

SimonDanisch avatar Jan 04 '22 10:01 SimonDanisch

Yes depending on FontConfig.jl here does look like the best solution. I'm happy to review a PR for this.

rafaqz avatar Jan 04 '22 13:01 rafaqz

If there is agreement about making this package depend on FontConfig.jl, then I might be able to create a PR, hopefully this weekend.

fontconfig syntax differs from the current findfont one. So would it be safer to add the new function load_font described on discourse, and only after a while deprecate the current findfont, which has proven reliable?

ederag avatar Jan 04 '22 14:01 ederag

Yes depending on FontConfig.jl here does look like the best solution.

So I tried pretty hard to use FontConfig instead of rolling our own font search, but I couldn't get it to work reliably on all platforms. To be honest, I don't really remember the problems anymore, I just remember that I gave up and thought that a simple findfont would be easier and more reliable.

SimonDanisch avatar Jan 05 '22 10:01 SimonDanisch

So I tried pretty hard to use FontConfig instead of rolling our own font search, but I couldn't get it to work reliably on all platforms.

https://github.com/JuliaGraphics/Fontconfig.jl/issues/21 (solved a few months after the first findfont commit: 406441d83d6e11a8bc3ad994364b904f95bce7a1) and https://github.com/JuliaGraphics/Fontconfig.jl/issues/8 have been solved, but that remains a valid concern, as there are two open issues, both about installation (https://github.com/JuliaGraphics/Fontconfig.jl/issues/12 and https://github.com/JuliaGraphics/Fontconfig.jl/issues/30). They look specific to certain configurations, but are still good to keep in mind.

ederag avatar Jan 05 '22 11:01 ederag

Seems like there is at least BinaryBuilder now...But:

https://github.com/JuliaGraphics/Fontconfig.jl/pull/31#issuecomment-585247957

And I guess someone will need to maintain Fontconfig.jl and update the CIs etc...

SimonDanisch avatar Jan 05 '22 11:01 SimonDanisch

This seems relevant for https://github.com/JuliaLang/julia/pull/47184#issuecomment-1364028015. I inserted some debugging code:

$ git diff
diff --git a/src/findfonts.jl b/src/findfonts.jl
index 0b668a5..fabdc2e 100644
--- a/src/findfonts.jl
+++ b/src/findfonts.jl
@@ -136,9 +136,12 @@ function findfont(

     best_score_so_far = (0, 0, false, typemin(Int))
     best_font = nothing
+    @show font_folders
+    nfonts = 0

     for folder in font_folders
         for font in readdir(folder)
+            nfonts += 1
             fpath = joinpath(folder, font)
             face = try_load(fpath)
             face === nothing && continue
@@ -168,6 +171,7 @@ function findfont(
             end
         end
     end
+    @show nfonts best_font

     return best_font
 end

and got this output:

julia> @time using CairoMakie
 11.008582 seconds (18.42 M allocations: 1.153 GiB, 5.60% gc time, 0.54% compilation time)

julia> @time @eval scatter(0..1, rand(10), markersize=rand(10) .* 20)
font_folders = ["/home/tim/.julia/packages/Makie/Ggejq/assets/fonts", "/usr/share/fonts", "/usr/share/fonts/X11", "/usr/share/fonts/X11/Type1", "/usr/share/fonts/X11/encodings", "/usr/share/fonts/X11/encodings/large", "/usr/share/fonts/X11/misc", "/usr/share/fonts/X11/util", "/usr/share/fonts/cMap", "/usr/share/fonts/cmap", "/usr/share/fonts/cmap/adobe-cns1", "/usr/share/fonts/cmap/adobe-gb1", "/usr/share/fonts/cmap/adobe-japan1", "/usr/share/fonts/cmap/adobe-japan2", "/usr/share/fonts/cmap/adobe-korea1", "/usr/share/fonts/opentype", "/usr/share/fonts/opentype/urw-base35", "/usr/share/fonts/truetype", "/usr/share/fonts/truetype/ancient-scripts", "/usr/share/fonts/truetype/dejavu", "/usr/share/fonts/truetype/droid", "/usr/share/fonts/truetype/lato", "/usr/share/fonts/truetype/noto", "/usr/share/fonts/truetype/unifont", "/usr/share/fonts/type1", "/usr/share/fonts/type1/gsfonts", "/usr/share/fonts/type1/texlive-fonts-recommended", "/usr/share/fonts/type1/urw-base35", "/home/tim/.local/share/fonts", "/usr/local/share/fonts"]
nfonts = 1730
best_font = FTFont (family = TeX Gyre Heros Makie, style = Regular)
font_folders = ["/home/tim/.julia/packages/Makie/Ggejq/assets/fonts", "/usr/share/fonts", "/usr/share/fonts/X11", "/usr/share/fonts/X11/Type1", "/usr/share/fonts/X11/encodings", "/usr/share/fonts/X11/encodings/large", "/usr/share/fonts/X11/misc", "/usr/share/fonts/X11/util", "/usr/share/fonts/cMap", "/usr/share/fonts/cmap", "/usr/share/fonts/cmap/adobe-cns1", "/usr/share/fonts/cmap/adobe-gb1", "/usr/share/fonts/cmap/adobe-japan1", "/usr/share/fonts/cmap/adobe-japan2", "/usr/share/fonts/cmap/adobe-korea1", "/usr/share/fonts/opentype", "/usr/share/fonts/opentype/urw-base35", "/usr/share/fonts/truetype", "/usr/share/fonts/truetype/ancient-scripts", "/usr/share/fonts/truetype/dejavu", "/usr/share/fonts/truetype/droid", "/usr/share/fonts/truetype/lato", "/usr/share/fonts/truetype/noto", "/usr/share/fonts/truetype/unifont", "/usr/share/fonts/type1", "/usr/share/fonts/type1/gsfonts", "/usr/share/fonts/type1/texlive-fonts-recommended", "/usr/share/fonts/type1/urw-base35", "/home/tim/.local/share/fonts", "/usr/local/share/fonts"]
nfonts = 1730
best_font = FTFont (family = TeX Gyre Heros Makie, style = Bold)
 16.750444 seconds (185.45 k allocations: 11.756 MiB, 1.04% compilation time)
FigureAxisPlot()

I think you could add a fonts.jl to Makie that basically does this:

const best_regular = findfont(...)
const best_bold = findfont(...)

and then the choice would be precompiled. (You wouldn't call it at runtime at all.)

timholy avatar Dec 23 '22 15:12 timholy

~~In addition to finding default fonts at compile time as proposed by @timholy, one could cache fonts instead of looking up the font on every to_font invocation.~~ ~~This is the caching mechanism I wrote for UnicodePlots.~~

NVM, fonts are already cached.

t-bltg avatar Dec 23 '22 20:12 t-bltg

I guess a simple optimization could be just saving the list of font names for all found files in a text file. The font search as it is relies only on family and style name, as I have always found that to be the most reliable way to pick specific font variants, as opposed to trying to make the engine match a font whose name I already know by picking weight values etc. correctly. I used to fight with matplotlib a lot to make it match certain font variants back in the day.

I think most users really just want to select specific fonts and do not need a complicated matching engine. So we don't need to open each file just to read family and style name over and over again. The only thing to work out would be when to invalidate the cache.
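A rough sketch of such a name list, assuming a tab-separated text file and invalidation based on the modification times of the font folders. `folder_stamp`, `build_index`, `load_index` and `describe` are hypothetical names; `describe(path)` stands in for opening a file and reading its family and style, and the demo fakes it by parsing file names.

```julia
# Hypothetical on-disk index: one "path<TAB>family<TAB>style" line per font file.
# `describe(path)` must return a (family, style) tuple, or `nothing` for
# files that are not fonts.

# Cheap fingerprint of the font folders. A folder's mtime changes on most
# file systems when entries are added or removed, not when a file is edited.
folder_stamp(folders) = join(("$(f):$(mtime(f))" for f in folders), ";")

function build_index(describe, folders, index_file)
    open(index_file, "w") do io
        println(io, folder_stamp(folders))   # first line: fingerprint
        for folder in folders, file in readdir(folder)
            path = joinpath(folder, file)
            names = describe(path)
            names === nothing && continue
            println(io, join((path, names...), '\t'))
        end
    end
end

function load_index(describe, folders, index_file)
    # Rebuild when the index is missing or the folders changed since it was written.
    if !isfile(index_file) || readline(index_file) != folder_stamp(folders)
        build_index(describe, folders, index_file)
    end
    entries = Dict{Tuple{String,String},String}()
    for line in Iterators.drop(eachline(index_file), 1)
        path, family, style = split(line, '\t')
        entries[(lowercase(family), lowercase(style))] = path
    end
    return entries
end

# Demo with a throwaway folder and a stand-in `describe` that parses file names.
function fake_describe(path)
    endswith(path, ".ttf") || return nothing
    family, style = split(first(splitext(basename(path))), '-')
    return (family, style)
end

dir = mktempdir()
touch(joinpath(dir, "Lato-Bold.ttf"))
touch(joinpath(dir, "README"))
index = load_index(fake_describe, [dir], joinpath(mktempdir(), "fonts.tsv"))
index[("lato", "bold")]   # path of Lato-Bold.ttf inside `dir`
```

The index file lives outside the font folders here, so writing it does not change their modification times. Folder timestamps are only one possible invalidation rule; they would miss a font replaced in place.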

jkrumbiegel avatar Dec 23 '22 21:12 jkrumbiegel

I really don't see how we could reuse a FONT_CACHE filled at precompile time, since an FTFont holds a pointer (and that is non-serializable). However, we can cache the font paths as regular strings during precompilation, and avoid scanning more than 1k font files at runtime.

t-bltg avatar Dec 23 '22 21:12 t-bltg

Right, it has to be something durable. Cache the choice, not the result. Reading a single font file will be much faster than reading all of them.
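One way to picture "cache the choice": keep only name => path strings in a small file and check that the stored path still exists before trusting it. This is a sketch under assumptions, not what Makie or FreeTypeAbstraction.jl do; `resolve_font_path` and the `scan` argument are hypothetical, with `scan(name)` standing in for the full search and returning a path or `nothing`.

```julia
# Hypothetical durable cache of the *choice*: "name<TAB>path" lines in a text
# file. Only strings are stored, since a loaded face cannot be serialized.
# Assumes font names contain no tab or newline characters.
function resolve_font_path(scan, name, cache_file)
    # Fast path: reuse a stored path if that file is still there.
    if isfile(cache_file)
        for line in readlines(cache_file)
            key, stored = split(line, '\t'; limit=2)
            key == name && isfile(stored) && return String(stored)
        end
    end
    # Slow path: run the full scan once, then remember the answer.
    path = scan(name)
    path === nothing && return nothing
    open(io -> println(io, name, '\t', path), cache_file, "a")
    return path
end

# Demo with a stand-in scan that counts its calls and an empty fake font file.
scans = Ref(0)
fake_font = tempname() * ".ttf"
touch(fake_font)
fake_scan(name) = (scans[] += 1; fake_font)
cache_file = tempname()

resolve_font_path(fake_scan, "regular", cache_file)  # full scan, result stored
resolve_font_path(fake_scan, "regular", cache_file)  # answered from the cache file
```

The returned path can then be opened directly, e.g. with FTFont(font_path) as mentioned at the top of the thread, so only one font file is read. A stale entry (font uninstalled) simply falls through to a new scan.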

timholy avatar Dec 24 '22 10:12 timholy