Port resvg to `harfrust` and `fontations`
Changes made
- Replaces
rustybuzzwithharfrust - Replaces
ttf_parserwithskrifa
Notes
- This is failing 23 tests. These probably need to be manually checked. I'm waiting on https://github.com/linebender/resvg/pull/954 to investigate these.
- This requires an MSRV bump from
1.65to1.80 - This will represent a binary size bump for users of
resvg.
Production profile
The analysis below uses the following "production" profile:
[profile.production]
inherits = "release"
opt-level = 3
debug = false
lto = true
codegen-units = 1
strip = true
incremental = false
Binary size analysis:
According to cargo-bloat (not the most accurate):
harfrustandrustybuzzare both ~250kbttf_parserandread-fontsare both ~100kb- However,
ttf_parserincludes the metrics outlining functionality ofskrifa. And skrifa is another ~250kb.
Compiling the CLI (which is a much more useful test) with the production profile above (cargo build --profile production --bin resvg) I get:
2.50mbformain2.85mbfor this PR
So this PR causes a binary increase of around 350kb
Compile time analysis
Compile times on an M1 Pro (8+2 cores) are shown below (cargo build):
| Profile | main |
This PR |
|---|---|---|
debug |
4.68s | 9.09s |
release |
7.64s | 14.84s |
production |
24.59s | 34.00s |
So this PR roughly doubles compile times at regular opt settings (with the effect being less pronounced with LTO and other production optimisation settings enabled)
Yeah I think we first have to figure out how to deal with fontdb before doing anything with resvg. I don’t think RazrFalcon would want to switch it, and I’m not sure that fontique supports everything we need?
It's an interesting experiment, but I don't see a point in it. Is there a reason in switching to harfrust?
Last time I have checked it was still a beta, while rustybuzz (with all its faults) is a pretty robust solution.
PS: even if harfrust is faster, it doesn't matter either, because shaping isn't a bottleneck, lack of glyphs caching is.
Oh wait, it depends on proc-macros/syn?! Burn it with fire! No proc-macros in resvg.
Last time I have checked it was still a beta, while rustybuzz (with all its faults) is a pretty robust solution.
harfrust is rustybuzz, just with ttf-parser switched out by read-fonts. It passes all the same tests, so it should be no less robust than then rustybuzz.
Is there a reason in switching to harfrust?
The reason is that going forward, (most likely) no one will be continually developing ttf-parser/rustybuzz, while harfrust will be actively maintained.
I would suggest to come back to it in a year. No rush.
And once again, proc-macros are no go. We have to figure out that one as well.
Last time I have checked it was still a beta
Not sure about 'beta' type labels, but the first release (0.1.0) of HarfRust happened on June 10, marking a significant milestone.
It's an interesting experiment, but I don't see a point in it. Is there a reason in switching to
harfrust?
For an immediate effect, it seems to be far more compliant when tested against the HarfBuzz test suite.
To quote Chad:
We’re still mulling over some public API changes and need to do some serious performance work but this is fully backed by fontations, up to date with HarfBuzz 11.2.1, and passes a significantly larger portion of the HB test suite than RustyBuzz: latest CI shows 5911 tests passing for HarfRust vs 2267 for RustyBuzz.
And the HarfRust introduction post by Behdad contains the following claim:
At the time of this writing, HarfRust passes most of the HarfBuzz shaping test suites, failing just 23 tests out of more than 6,000.
Moving forward, HarfRust will receive active development and continuous sync with HarfBuzz, while RustyBuzz is likely to only see critical fixes.
latest CI shows 5911 tests passing for HarfRust vs 2267 for RustyBuzz.
Apples and oranges, but sure.
Either way, proc-macros is the current major roadblock.
@RazrFalcon Seems like the proc-macros are coming from font-types which uses bytemuck and serde derive macros. Do we want to talk to them about converting to manual implementations?
@mattfbacon if it's an easy fix - sure.
No it wouldn't be at all, since the bytemuck derive macros generate unsafe code so they would need to allow unsafe code back into the codebase.
But what is your plan here? Earlier you said wait, but for what?
No it wouldn't be at all, since the bytemuck derive macros generate unsafe code so they would need to allow unsafe code back into the codebase.
I would call it easy. It's not like the code deeply depends on proc-macros.
But what is your plan here? Earlier you said wait, but for what?
Adoption, etc. I don't see a point/reason in switching to 0.1. Especially if we would keep ttf-parser around. It just doesn't make any sense.
There are derive helpers based on macro_rules! rather than proc macros: https://matx.com/research/rules_derive
It is likely possible to implement an alternative derive for bytemuck traits using such helpers. These can then be published as a separate crate such as bytemuck_derive_lite to avoid any potential backward compatibility issues with the proc macro implementation.
Is there a reason in switching to
harfrust?
As some months have passed another benefit has appeared - performance. In the last month the HarfRust vs HarfBuzz OpenType shaping performance difference has gone from ~4x to ~1.2x, which is a massive boost. Not quite on par with HarfBuzz yet, but getting a lot closer. More importantly for this PR here though, HarfRust is now a lot faster than RustyBuzz.
The geometric mean of the last result is ~1.16, meaning Harfrust is now only 16% slower than Harfbuzz.
The geometric mean of the last result is ~1.16, meaning Harfrust is now only 16% slower than Harfbuzz.
It's even better than that. See the comment: https://github.com/harfbuzz/harfrust/discussions/257#discussioncomment-14363176
In short, there's some hb-harfrust overhead included in the spreadsheet number.
Diff for the colv1 test https://www.diffchecker.com/V8j95jlx with some manual fixups (removing groups - although it could also be those causing the differences?). I'm seeing some negations of y-coordinates (which possibly cancel each other out?), some rounding differences, and also that skrifa is applying some opacities to groups whereas ttf-parser applies them directly to the paths.
As long as the COLRv1 test looks visually the same, it should be fine.
@LaurenzV check out the visual diff on github. It's not quite the same. It's subtle, but one is "bolder" than the other (for lack of a better term). The diff produced by rhe test runner looks like the outline of each part of each glyph.
It's possible the difference stems from the fact that the opacities are applied differently. When applying directly to a path, it will simply draw a path with that, while when applying to a group resvg will do layer isolation and apply the opacity that way instead, so you could try if this can be fixed by applying it directly.
But if I remember COLRv1 might actually define opacities in terms of layer, in which case the previous behavior was simply incorrect. In any case, it still looks about the same so I wouldn't be too worried.
Ok, I've managed to fix the CBDT offset issue (I admit more by trial-and-error than by principle), and I've also managed to fix to fix the COLRv0 missing glyph parts (this was a silly bug where string was being thrown away rather than persisted).
That leaves:
The COLRv1 and Zalgo tests. Diff output from the test runner for these tests is as below:
For the COLRv1 tests, see also the diff in the generated "lowered" SVG output (https://www.diffchecker.com/V8j95jlx)
@dfrg Do you have an opinion on whether these represent a genuine bug, a fixing of a bug, or a valid difference in interpretation?
Y-coordinate negations
There seems to be negated output for each y-coordinate in a path, which is then cancelled out by a negated y coordinate in a transform matrix. This is probably fine? But also, maybe we should fix it?
Missing path segments
Some of the text-related usvg "resave" tests (parse SVG then rereserialize back to SVG) are failing. Some sample diffs:
- text_simple https://www.diffchecker.com/e2SQWqnm/
- complex_text https://www.diffchecker.com/tQk27XmG/
- preserve_text_simple_case https://www.diffchecker.com/Ip5F1whl/
- text_with_generated_gradients https://www.diffchecker.com/gYS1sYk2
- clip_path_with_text https://www.diffchecker.com/tDIazl6O/
The issue seems to be that the last segment of some (but not all) paths is being dropped in the Skrifa implementation. Although this seems to make no visual difference to the rendered SVG (see github diff), so perhaps that's also fine?
As long as they visually appear the same, it’s fine. It’s not expected that we achieve pixel-level accuracy.
There seems to be negated output for each y-coordinate in a path, which is then cancelled out by a negated y coordinate in a transform matrix. This is probably fine? But also, maybe we should fix it?
These are equivalent but negating the y scale in the matrix rather than the y coords will generate smaller SVGs so I imagine that would be preferred.
The issue seems to be that the last segment of some (but not all) paths is being dropped in the Skrifa implementation. Although this seems to make no visual difference to the rendered SVG (see github diff), so perhaps that's also fine?
Skrifa omits the final line segment if it matches the beginning of the subpath because this is effectively the same as the close command.
I’m not familiar enough with SVG to understand the differences between group and path opacity so I can’t speak to those diffs. The output of both looks reasonable to me.
I’m not familiar enough with SVG to understand the differences between group and path opacity so I can’t speak to those diffs.
https://razrfalcon.github.io/notes-on-svg-parsing/isolated-groups.html
As some months have passed another benefit has appeared - performance.
Good. It would not affect resvg in any way, because shaping was never a bottleneck, but still a progress.
And rustybuzz is dead anyways.
Harfrust 0.3.2 is now out.