icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

Reduce monomorphization in Date::try_from_fields and added size measue

Open SATVIKsynopsis opened this issue 1 month ago • 26 comments

Fixes #7201

This PR refactors the resolve_fields_non_generic() logic in calendar_arithmetic.rs to reduce monomorphization and improve binary size when using Date::<AnyCalendar>::try_from_fields().

It also adds a new example (date_try_from_fields.rs) to measure before and then after binary size differences, as requested.

Binary changes (original file) size comparison:

$ cargo build --release --example date_try_from_fields $ ls -lh target/release/examples/date_try_from_fields -rwxr-xr-x 2 kiit kiit 588K Dec 6 17:17 date_try_from_fields

$ strip target/release/examples/date_try_from_fields $ ls -lh target/release/examples/date_try_from_fields -rwxr-xr-x 2 kiit kiit 460K Dec 6 17:18 date_try_from_fields

image

After changes size comparison:

$ cargo build --release --example date_try_from_fields $ ls -lh target/release/examples/date_try_from_fields -rwxr-xr-x 2 kiit kiit 576K Dec 6 17:20 date_try_from_fields

$ strip target/release/examples/date_try_from_fields $ ls -lh target/release/examples/date_try_from_fields -rwxr-xr-x 2 kiit kiit 452K Dec 6 17:21 date_try_from_fields

image

This demonstrates a measurable reduction in monomorphized code size.

The stripped example binary decreased from 460 KB (before) to 452 KB (after), a net savings of ~8 KB (~1.7%).

This reduction comes from refactoring resolve_fields_non_generic() to avoid monomorphizing calendar-specific logic into generic code paths.

SATVIKsynopsis avatar Dec 06 '25 12:12 SATVIKsynopsis

Thanks for running benchmarks, the size reduction looks promising! However, I don't think it makes sense to review this as long as it changes behaviour. Please make the tests pass.

robertbastian avatar Dec 08 '25 14:12 robertbastian

I have pushed a fix for the CI failures from the last commit.

Previously, I achieved an 8 kB reduction, but that code failed the tests because I had simplified the year/era logic too much. I realized I removed some necessary calculations that specific calendars need.

I have fixed that now. Here is what I did in this new version:

I moved all the data preparation into a prepare helper function.

I moved all the final checks and error handling into a finalize helper function.

The main function now just sits in the middle and calls the specific calendar methods using the data from the helpers.

This approach is safe and passes all tests. The size reduction is now about 6 kB (~1.3%). While this is slightly less than the previous 8 kB reduction, the logic is now fully correct and now and the CI issues will pass.

image

The size successfully reduced from the original file which was 460 kb to 454 kb.

SATVIKsynopsis avatar Dec 09 '25 11:12 SATVIKsynopsis

$ cargo build --release --example date_try_from_fields

Could you add this code to the pull request?

sffc avatar Dec 09 '25 16:12 sffc

That file was breaking CI tidy request so I removed that from commit history. I will add it as asked now.

SATVIKsynopsis avatar Dec 09 '25 16:12 SATVIKsynopsis

I have added the example file as asked.

SATVIKsynopsis avatar Dec 09 '25 16:12 SATVIKsynopsis

Thank you. Could you make it use #![no_main] and icu_benchmark_macros as we do in other examples? See here:

https://github.com/unicode-org/icu4x/blob/main/components/icu/examples/iso_date_manipulations.rs

It removes a lot of standard library code from the binary which helps you get better code size readings.

sffc avatar Dec 09 '25 17:12 sffc

I have updated the file to use #![no_main] and the icu_benchmark_macros as you suggested, aligning with the pattern in the iso_date_manipulations.rs example.

SATVIKsynopsis avatar Dec 09 '25 17:12 SATVIKsynopsis

Thanks; can you re-measure the size impact now?

When building, I suggest using --profile release-opt-size

sffc avatar Dec 09 '25 18:12 sffc

I mean, what is the size of the example before and after the change to the try_from_fields code?

I have a suggestion: please open a separate PR for the example, because then we can easily track in the dashboard what is the impact of this PR on the code size.

sffc avatar Dec 09 '25 18:12 sffc

Using --profile release-opt-size, the results are:

Before: ~341 KB

After: ~343 KB

This profile adds its own instrumentation, so the change appears slightly positive. However, under the normal --release profile, the size still improves (460 KB → 454 KB), confirming that the refactor has a net positive impact.

SATVIKsynopsis avatar Dec 09 '25 18:12 SATVIKsynopsis

We've spent time making sure that --profile release-opt-size gives a reliable measurement of code size, better than --release. If you're not seeing an improvement with that profile, it means that more work is needed on the PR.

Please start by making a separate PR for the example that measures code size; there are more changes I'd like to suggest over there. Then, this PR can cleanly evaluate only the library code change. It might take some experimentation to figure out how to reduce the code size.

sffc avatar Dec 09 '25 18:12 sffc

You should run cargo make wasm-wasm-examples for platform-independent size benchmarks. This is what gets tested in CI (https://unicode-org.github.io/icu4x/benchmarks/binsize/wasm/index.html)

robertbastian avatar Dec 09 '25 19:12 robertbastian

Are there any final changes or checks you would like me to perform on this PR as the dependent example file PR is already merged? Thank you.

SATVIKsynopsis avatar Dec 09 '25 23:12 SATVIKsynopsis

Please update the benchmark numbers. The baseline is ~40KB

robertbastian avatar Dec 09 '25 23:12 robertbastian

Thank you for confirming the official baseline of ~40 KB.

I also tested on my local toolchain using a clean build. On my machine, the wasm output reports ~56 KB, which appears to be due to local overhead and differences in the build environment.

However, the structural change in this PR removes approximately 6 KB worth of monomorphized code paths, and the difference shows up consistently in the native binary (--profile release-opt-size).

Given that the official CI environment produces the established 40 KB baseline, the same structural reduction should apply there as well.

SATVIKsynopsis avatar Dec 10 '25 00:12 SATVIKsynopsis

On main:

$ cargo make wasm-wasm-examples
$ ls -l wasmpkg/date_try_from_fields.wasm 
-rwxr-xr-x 1 sffc primarygroup 47052 Dec  9 17:17 wasmpkg/date_try_from_fields.wasm

I tried running on this branch, but you need to remove the old example from this branch and merge in the new example from main.

sffc avatar Dec 10 '25 01:12 sffc

I have removed the old example file from this branch as requested, which resolves the build conflict.

The PR now contains only the final, optimized calendar arithmetic file.

SATVIKsynopsis avatar Dec 10 '25 02:12 SATVIKsynopsis

I checked out your branch and merged in main locally to get a measurement.

$ cargo make wasm-wasm-examples
$ ls -l wasmpkg/date_try_from_fields.wasm 
-rwxr-xr-x 1 sffc primarygroup 59394 Dec  9 19:38 wasmpkg/date_try_from_fields.wasm

So there is a 26% size increase.

Now that you have a specific thing to measure against, please continue experimenting until you get the file size down.

sffc avatar Dec 10 '25 03:12 sffc

Thank you for the detailed feedback.

To ensure I am debugging against the correct target, could you please clarify the size baseline for the date_try_from_fields wasm example on the current state of the main branch?

Thank you.

SATVIKsynopsis avatar Dec 10 '25 04:12 SATVIKsynopsis

I posted it a few posts above, but you should be able to reproduce it yourself, which you need to do in order to check if your changes improve it or not

sffc avatar Dec 10 '25 04:12 sffc

Thanks for the feedback. Will do the same and introduce the changes soon.

SATVIKsynopsis avatar Dec 10 '25 04:12 SATVIKsynopsis

@SATVIKsynopsis I think you deleted a comment, but finding astronomical calculations in the binary is a very good catch, way more impactful than what you are doing here 😃 See #7301

robertbastian avatar Dec 10 '25 14:12 robertbastian

I had previously inspected the wasm binary with twiggy, and I noticed that many astronomical functions (lunar longitude, solar longitude, lunar phase, etc.) were still present even after the Hijri changes.

Now that HijriAstronomicalSimulation has been removed, should I start exploring similar size reductions for the Chinese Traditional calendar?

specifically: are the chinese astronomy calculations candidates for replacement with tabular / hardcoded data (like Hijri)? if so, what range of years would be acceptable? should I open an exploratory pr or proposal first before writing changes?

I want to make progress in the right direction and avoid working on parts that won’t meaningfully reduce code size, so guidance would be appreciated.

SATVIKsynopsis avatar Dec 10 '25 14:12 SATVIKsynopsis

are the chinese astronomy calculations candidates for replacement with tabular / hardcoded data (like Hijri)? if so, what range of years would be acceptable?

we've already done this, with #7301 there isn't any astronomical code in the binary anymore

robertbastian avatar Dec 10 '25 14:12 robertbastian

Thanks for the update Since the Hijri astronomical simulation has already been removed in #7301, I want to help reduce the WASM size further.

Are there any specific calendars still known to contribute significantly to WASM size?

For example: • Hebrew lunisolar? • Korean Traditional? • Others with large lookup tables or complex month/era resolution?

If you can point to which calendar would benefit the most from optimization (tabular fallback, data trimming, reducing branching, etc.), I can start working on that area next.

SATVIKsynopsis avatar Dec 10 '25 14:12 SATVIKsynopsis

just a small follow up . whenever you get a chance, could you point me toward which calendar would be the most useful to target next for wasm size reduction?

I tested optimizing only the arithmetic date logic (without touching any of the astronomical calculation code), and that alone reduced the example wasm from ~47 KB(baseline) to ~45 KB (~2 KB improvement).

before I start exploring deeper calendar-specific optimizations, i wanted to make sure i am focusing on the area that will give the best impact, so I will wait for your guidance.

No rush , just checking in. thanks.

SATVIKsynopsis avatar Dec 11 '25 09:12 SATVIKsynopsis

Thanks for the guidance so far.

I have experimented further on this branch and confirmed that pure arithmetic level refactoring (without touching calendar specific data or astronomy) yields a ~2 KB reduction in the wasm example (~47 KB → ~45 KB).

At this point, additional gains likely require calendar specific changes or data layout work rather than generic arithmetic refactors.

I am happy to keep this PR as a draft or exploratory branch, or close it and follow up with a more targeted proposal once there’s agreement on the next optimization area.

Please let me know which you would prefer.

SATVIKsynopsis avatar Dec 17 '25 11:12 SATVIKsynopsis

Those numbers are from before #7301 was merged?

I'll let @sffc decide how to proceed, 2kB isn't nothing, if we can verify that there is no behaviour change I'm open to merging this. We might need additional tests.

robertbastian avatar Dec 17 '25 14:12 robertbastian

the merge of #7301 was a massive win for the binary size. It dropped the baseline all the way down to 19 kb.

previously, on the old 47 KB baseline, my changes shaved off about 2 KB. however, because the binary is now so lean those same optimizations are hitting the hard floor of the standard library overhead. I tried aggressively optimizing further (even stripping error strings), but the size stays stuck at 19 KB.

It looks like 19 kb is the absolute limit we can reach right now. since the main bloat is gone, my arithmetic refactor doesn't have much left to squeeze out.

SATVIKsynopsis avatar Dec 17 '25 18:12 SATVIKsynopsis

thanks for confirming, I will close this then.

robertbastian avatar Dec 17 '25 20:12 robertbastian