Reduce Dart binary size
The ICU4X shared library with full compiled data and all features currently measures around 30MB on Linux. For ICU4X to be usable from Dart, we need to reduce this substantially.
How does ICU4X deal with this in other languages
ICU4X's API is designed around many small functions, so that the compiler can aggressively optimise. In Rust, the compiler has a view of the whole program, so it is in a position to throw out unused code (and data). This also holds for C/C++ when doing static linking, where the C compiler compiles C code against our static library with a whole-program view.
Why does this not work in Dart
Dart only supports dynamic linking (https://github.com/dart-lang/sdk/issues/49418), i.e. the library is loaded into memory at runtime, and the system helps the Dart binary find the required functions. This means, however, that the shared library is compiled independently of the Dart binary, and cannot be compile-time optimised.
Approaches for reducing the binary size in Dart
Static linking
The simplest, and probably most performant, solution would be for Dart to support static linking. However, there's currently no concrete plan for this on the Dart side (https://github.com/dart-lang/sdk/issues/49418).
Tree shaking
Conceptually, Dart already uses something like static linking. We do not dynamically link against a system (or shared) library, instead we need to ship our own library inside Dart's asset system (https://github.com/dart-lang/sdk/issues/54003). The Dart compiler is aware of all ICU4X functions that are reachable in the compiled binary (through the @ResourceIdentifier which we added in https://github.com/rust-diplomat/diplomat/pull/442), so it could remove unreachable symbols from the dynamic library. There is currently a work-in-progress custom link.dart script, which gets invoked during the compilation and has access to the list of @ResourceIdentifier. We can use this to tree-shake our shared library to a minimal shared library.
Filtering a shared library
Shared libraries are platform-specific executable files (e.g. ELF files on Linux). They contain code and very little metadata beyond a symbol table in the shape that a dynamic linker can understand. We were not able to find any tools that can filter a dynamic library in the way we require.
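To make the symbol table concrete, here is a minimal sketch that lists the exported names of a shared library on Linux. It assumes GNU binutils `nm` is installed; the helper name `exported_names` and the path `out.so` are only illustrative and not part of any ICU4X or Dart tooling.

```shell
# Print the names of exported, defined symbols from the output of
# `nm -D --defined-only`, read on stdin. Each input line has the form
# "<address> <type> <name>"; upper-case type letters mark global symbols.
exported_names() {
  awk 'NF == 3 && $2 ~ /^[TDBRW]$/ { print $3 }'
}

# Usage (assumes a library exists at the hypothetical path out.so):
#   nm -D --defined-only out.so | exported_names
```

This only reads the table. It does not remove anything, which is the part we found no existing tool for.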
Creating a minimal shared library from a static library
We do have access to a native C toolchain for each Dart target through the native_assets_cli package. This means we can use a two-step compilation process as follows:
- We build ICU4X into a static library (`.a`) using the Rust compiler
  - This artifact gets shipped to the Dart dev through the mechanism in #4689
- In the `link.dart` step, we link the static library into a dynamic library, including only the required symbols. For example, the `clang` linker can be invoked as follows:
```
/* symbols.lds */
{
  global:
    ICU4XFixedDecimal_create_from_i32;
  local:
    *;
};
```
```shell
$ clang -fPIC -shared -u ICU4XFixedDecimal_create_from_i32 -Wl,--version-script=symbols.lds \
    -Wl,--gc-sections -Wl,-strip-debug -o out.so <static-lib>
```
In experiments this reduces the binary size to e.g. ~1.7MB for collation (including data).
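Since the list of required symbols will come from the Dart compiler rather than being written by hand, the version script and the `-u` flags would have to be generated. The following is a minimal sketch of that step in POSIX shell, not the actual `link.dart` logic; the helper names `write_version_script` and `undefined_flags` are hypothetical.

```shell
# Hypothetical helper: write a linker version script that exports only the
# given symbols and hides everything else.
# Usage: write_version_script OUT_FILE SYMBOL...
write_version_script() {
  out=$1
  shift
  {
    echo '{'
    echo '  global:'
    for sym in "$@"; do
      printf '    %s;\n' "$sym"
    done
    echo '  local:'
    echo '    *;'
    echo '};'
  } > "$out"
}

# Hypothetical helper: print "-u SYMBOL " for each symbol, so that the linker
# pulls each one out of the static archive.
undefined_flags() {
  for sym in "$@"; do
    printf '%s %s ' '-u' "$sym"
  done
}

# Regenerate the script shown above from a one-element symbol list.
write_version_script symbols.lds ICU4XFixedDecimal_create_from_i32

# The link step could then look roughly like this (not executed here):
#   clang -fPIC -shared $(undefined_flags ICU4XFixedDecimal_create_from_i32) \
#     -Wl,--version-script=symbols.lds -Wl,--gc-sections -o out.so <static-lib>
```

Keeping both outputs derived from the same symbol list avoids the case where a symbol is exported in the script but never pulled in from the archive, or the reverse.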
Open questions
So far we have tested this approach on Linux. We will need to confirm that this is feasible for all Dart platforms.
Data size
While the shared library tree shaking is able to reduce code size by removing unused functionality, it is not able to remove unused locales. ICU4X by default builds with around 200 locales in "compiled data" mode, which make up a large chunk of the binary size.
Custom compiled data
The most performant approach to custom data in ICU4X is custom compiled data. This uses icu_datagen to generate Rust code, which is then used during the build of the ICU4X binary. However, as we lack the ability to build the ICU4X library during the Dart build, we cannot use this approach in the general case. We could generate binaries with different sets of locales, but this would lead to a combinatorial explosion of Dart platform × locale set combinations, and it's unclear which locale sets we should support.
Serialized data
The more flexible approach to custom data is to load serialised data blobs at runtime. Our deserialisation is zero-copy (no allocation, only validation), so there's no significant performance impact. It does however let us generate data and binaries separately.
In this approach we will generate the static library with only a small subset of universally required compiled data (such as fallback data), and everything else will be provided by serialised data. We can generate the required blob of serialised data in the link.dart phase, as we have a list of used functions, which we can map to required data (https://github.com/unicode-org/icu4x/issues/2685). This will be done by a precompiled Rust binary (https://github.com/unicode-org/icu4x/pull/4347), which we ship for each host platform. The binary will include the complete precomputed data, in order to not have to generate data from first principles (CLDR), but only to filter out unselected locales.
We then use Dart's asset functionality to package the serialised blob into the Dart binary, and access it at runtime.
Open questions
In order to generate custom locale data, we need some way for the client to select the desired list of locales, which we can consume in link.dart.
- @mosuem can you link an issue for this?
Next steps
- [ ] Validate feasibility of static-to-shared building using the Dart toolchain for all platforms
- [ ] Host `icu_datagen_dart` binaries
- [ ] Call `icu_datagen_dart` in `link.dart`
  - Likely blocked on https://github.com/unicode-org/icu4x/issues/2685, however static analysis of the filtered shared library might be a solution for this
  - Needs Dart locale selection mechanism
> This will be done by a precompiled Rust binary (https://github.com/unicode-org/icu4x/pull/4347), which we ship for each host platform.
This will also have to be distributed via the CDN together with the ICU4X binaries, see #4689. So there, we will probably want to distribute (zipped together?) a compiled icu4x, a compiled icu_datagen, and a full data blob.
> In order to generate custom locale data, we need some way for the client to select the desired list of locales, which we can consume in link.dart.
Ideally, this would be provided using the same @ResourceIdentifier mechanism, as part of the API of package:intl4x. I opened an issue here.
#2685
Discussion with @Manishearth @sffc @mosuem @robertbastian
There will be two compilation modes: with and without Rust
- Without Rust
  - either downloads the library from GitHub
  - or points to a local precompiled library (mainly useful for debugging)
  - `native_toolchain_c` will provide a linker, which we will use in `link.dart` to convert the static library into a tree-shaken dynamic library
    - https://github.com/dart-lang/native/pull/987
    - For Flutter builds, target platform linkers are available
    - Otherwise, it will error if it can't find a linker for the target
    - There is no cross-compilation in Dart (without Flutter), so the target linker will be the host linker
    - The symbol filtering logic can be added to `native_toolchain_c`, as a way to encapsulate platform-specific linker invocations
  - `native_toolchain_c` doesn't guarantee LLVM, so we cannot use LTO in the general case
    - stretch: we can ship `libicu_capi.a`, `libicu_capi.llvm17.a`, `libicu_capi.llvm16.a`, `libicu_capi.llvm15.a`, ... and do LTO if we happen to have a matching LLVM version
- With Rust
  - fetches the correct release from crates.io
  - or points to a manual checkout of `unicode-org/icu4x` (useful for debugging or with custom package management)
  - uses the full Rust toolchain to create a tree-shaken dynamic library
    - this uses the `--export` wrapper thing we currently do in the `js-tiny` tutorial
    - sounds like it should generally work according to https://github.com/rust-lang/rust/issues/73958
    - might not work on Windows according to that issue
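The stretch goal of shipping one archive per LLVM version could be sketched as follows. This is only an illustration in POSIX shell: the helper names `clang_major` and `pick_archive` are hypothetical, the archive names follow the pattern from the notes above, and the sketch assumes that the reported clang major version equals the LLVM version, which does not hold for Apple's clang.

```shell
# Extract the major version from `clang --version` output read on stdin,
# e.g. "clang version 17.0.6" gives "17". Prints nothing if no match.
clang_major() {
  sed -n 's/.*clang version \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# pick_archive DIR MAJOR: print the LLVM-specific archive if one was shipped
# for this major version, otherwise fall back to the plain archive (no LTO).
pick_archive() {
  dir=$1
  major=$2
  if [ -n "$major" ] && [ -f "$dir/libicu_capi.llvm$major.a" ]; then
    echo "$dir/libicu_capi.llvm$major.a"
  else
    echo "$dir/libicu_capi.a"
  fi
}

# Usage (assumes clang is on PATH and the archives live in ./lib):
#   pick_archive ./lib "$(clang --version | clang_major)"
```

Falling back to the plain archive keeps the build working with any linker, so LTO stays an optional optimisation rather than a requirement.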