ttfunk
ttfunk copied to clipboard
Reduce array allocations when loading a font
Hi!
I was investigating performance of PDF generation and I found that the creation of these [left, right]
arrays where the largest source of object allocations in my flow. I found a way to eliminate the need for those arrays while maintaining compatibility.
Are you interested in a patch like this? If so, I'm happy to clean it up and make sure it's properly tested. Let me know what you think.
@rmosolgo Can you please provide some benchmarks so that I could have at least some idea of what kind of improvement we're talking about?
Hi, sure thing, thanks for taking a look! Here's a benchmark I wrote to demonstrate the impact:
require "bundler/inline"
gemfile do
gem "prawn"
gem "matrix"
gem "benchmark-ips"
gem "memory_profiler"
# To test local changes:
# gem "ttfunk", path: "./"
end
def load_font
prawn_doc = Prawn::Document.new
prawn_doc.font_families.update(
"OpenSans" => {
normal: "/Users/rmosolgo/code/aqualytics/vendor/open_sans/OpenSans-Regular.ttf",
bold: "/Users/rmosolgo/code/aqualytics/vendor/open_sans/OpenSans-Bold.ttf",
italic: "/Users/rmosolgo/code/aqualytics/vendor/open_sans/OpenSans-Italic.ttf",
bold_italic: "/Users/rmosolgo/code/aqualytics/vendor/open_sans/OpenSans-BoldItalic.ttf",
semibold: "/Users/rmosolgo/code/aqualytics/vendor/open_sans/OpenSans-Semibold.ttf",
}
)
prawn_doc.font("OpenSans")
end
Benchmark.ips do |x|
x.report("load_font") { load_font }
end
puts "\n\n"
report = MemoryProfiler.report do
load_font
end
report.pretty_print
Here's the before/after diff:
Calculating -------------------------------------
- load_font 96.964 (± 4.1%) i/s - 486.000 in 5.019886s
+ load_font 154.372 (± 6.5%) i/s - 770.000 in 5.010979s
- Total allocated: 2929728 bytes (19026 objects)
+ Total allocated: 2036720 bytes (817 objects)
Total retained: 0 bytes (0 objects)
So, that's more than 50% faster and almost 96% fewer Ruby object allocations (and 30% less memory usage).
Looking down the MemoryProfiler output, it wasn't too hard to spot this area as the hotspot:
allocated objects by location
-----------------------------------
- 18694 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/ttfunk-1.7.0/lib/ttfunk/table/kern/format0.rb:27
+ 490 /Users/rmosolgo/code/ttfunk/lib/ttfunk/table/kern/format0.rb:19
58 <internal:pack>:21
38 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/ttfunk-1.7.0/lib/ttfunk/directory.rb:19
Nice. Very nice, actually. Though, I'm a little concerned that this changes public API. I don't think the current API is very good or maps well on the underlying format but it is our public API.
I wonder if we can get similar results by lazily populating pairs instead of parsing the table completely.
Yeah, that sounds good -- looking into the prawn
source, it seems like pairs
is only used when options[:kerning]
is used by the application. So if we skipped initializing the table, and instead, implemented the pairs
hash to create entries on-demand, we might never make the table to begin with. Is that kinda what you have in mind?
Yeah, pretty much, but also even when kerning is used we might not need much of the kern
table. Maybe binary search can be faster for most use cases. After all the table was designed with it in mind.
this looks like a great idea.
Would it make sense to introduce pairs
to keep the interface?
It is stated that pairs
is a reader. Is pairs
read-only?
def pairs
@pairs.each_with_object({}) do |(left, right_hash), out|
right_hash.each do |right, value|
out[[left, right]] = value
end
end
end
Or I guess you could introduce a proxy object (back of envelope code)
class PairProxy
def []=(keys, value)
@pairs[keys.first][keys.last] = value
end
def [](keys)
@pairs[keys.first][keys.last]
end
end
def pairs
PairProxy.new(@pairs)
end