icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

Improving `icu_harfbuzz` compiled data

Open robertbastian opened this issue 1 year ago • 3 comments
trafficstars

The current HB compiled data code is not ideal, because it uses static_to_owned, which might be blocking optimisations. This also let me realise that some compiled constructors that could be const aren't.

robertbastian avatar Feb 01 '24 17:02 robertbastian

Before I review this, were you aware of https://github.com/unicode-org/icu4x/issues/3257, https://github.com/servo/rust-harfbuzz/pull/197, and https://github.com/servo/rust-harfbuzz/pull/245 when writing this PR?

sffc avatar Feb 01 '24 22:02 sffc

I found it after. I do think the changes here are independent of that though, we'll need a compiled data and a data provider path if we want maximum performance.

robertbastian avatar Feb 01 '24 22:02 robertbastian

The PR seems to be based on the claim that the current code "uses static_to_owned, which might be blocking optimisations". I'm fairly concerned about this statement because static_to_owned is clearly documented as being "cheap", and it is a const function. If there is a problem with that function, should we go back to the drawing board? It is designed, I believe, specifically for cases like this where we want to coalesce static and dynamic data down to a single code path.

Coalescing static data into the dynamic codepath inherently blocks optimisations. As soon as a DataPayload gets involved, all the non-static code gets pulled in, and the enum switch inhibits any possible optimisations against the static data. static_to_owned itself is cheap, because creating a static DataPayload is cheap, but we never claimed that it's as efficient as using the fully static codepath.

robertbastian avatar Feb 05 '24 10:02 robertbastian