icu4x icon indicating copy to clipboard operation
icu4x copied to clipboard

Improve efficiency of date add/until

Open sffc opened this issue 3 months ago • 9 comments

The specification algorithm is designed to be readable, but it is not fast, because it adds one month and one year at a time. There are two cases that can be optimized:

  1. Adding 1000 months to a date requires running a loop for each year, and adding the number of days in each year until the month is in-range. For calendars with a fixed number of months per year, this is trivial to fast-path. For calendars with a variable number of months per year, this can likely still be fast-pathed, but more care needs to be taken.
  2. Adding 1000 days to a date requires running a loop for each month, and adding the number of days in each month until the day is in-range. No calendar has a fixed number of days per month. But, a fast-path could be to add a years-worth of days all at once, and then to add multiple years at once, being careful about leap years, of course.

sffc avatar Oct 13 '25 22:10 sffc

Adding 1000 days to a date requires running a loop for each month, and adding the number of days in each month until the day is in-range. No calendar has a fixed number of days per month. But, a fast-path could be to add a years-worth of days all at once, and then to add multiple years at once, being careful about leap years, of course.

It could be cheaper to convert to RD, add 1000, and convert back.

robertbastian avatar Oct 14 '25 13:10 robertbastian

Yeah, that makes sense.

Adding years will always be cheap. When it comes to adding months, you have:

  • For solar calendars this is entirely computable using a formula
  • For Hebrew this is also computable with a formula
  • For Hijri this is computable with a formula if you are outside of UAQ/simulated range. So you use the regular algorithm to get to the range edge and then apply a formula
  • For Chinese this is approximatable by applying a mean synodic month length, and you can get the exact answer by then using the yearinfo for that year.

And for adding weeks/days we can just manipulate RDs.

At the moment, testing in V8, adding month/day values that keep us within Temporal range is not noticeably slow.

Manishearth avatar Oct 14 '25 15:10 Manishearth

can you assign me this @sffc

tanishqshrivas avatar Oct 27 '25 14:10 tanishqshrivas

At the moment, testing in V8, adding month/day values that keep us within Temporal range is not noticeably slow.

I was wrong, there are some values which are super slow

https://github.com/unicode-org/icu4x/pull/7206#issuecomment-3474508829

I think the code does not know in advance that this will be out of range.

Manishearth avatar Oct 31 '25 19:10 Manishearth

I'd like to have more thorough tests in place before landing algorithm changes. #7215

sffc avatar Nov 04 '25 00:11 sffc

I think the code does not know in advance that this will be out of range.

Let's fix this as part of #7076. If the DateDuration has a year number that will definitely exceed the pan-calendar year limit, then we can exit early.

sffc avatar Nov 04 '25 01:11 sffc

Sounds reasonable!

Manishearth avatar Nov 04 '25 20:11 Manishearth

This is also what I ended up doing in temporal_rs: predicting when a value will definitely produce something out of range https://github.com/boa-dev/temporal/pull/615

Manishearth avatar Nov 04 '25 20:11 Manishearth

This program using temporal_rs appears to hang indefinitely (although a debugger shows that it's running the loop in https://github.com/unicode-org/icu4x/blob/main/components/calendar/src/calendar_arithmetic.rs#L724 ):

use temporal_rs::{Calendar, PlainDate};
use temporal_rs::options::{DifferenceSettings, Unit};

fn main() {
    let earlier = PlainDate::new(2020, 1, 1, Calendar::GREGORIAN).expect("creating earlier failed");
    let later = PlainDate::new(275760, 1, 1, Calendar::GREGORIAN).expect("creating later failed");
    let mut settings = DifferenceSettings::default();
    settings.largest_unit = Some(Unit::Week);
    let result = earlier.until(&later, settings).expect("until failed");
    println!("{}", result);
}

I think this is related to this bug.

catamorphism avatar Dec 11 '25 00:12 catamorphism

It's interesting that that doesn't hang for days, because temporal_rs handles that case early

https://github.com/boa-dev/temporal/blob/a8ffa8192eee95633064ef756143519329813bfb/src/builtins/core/plain_date.rs#L228-L231

That early codepath could also apply for weeks. But ICU4X could also apply that optimization.

In release builds, year/month diffs seem to take an acceptable amount of time, which honestly surprises me a little bit, since a month diff basically take 1/4th as long as a week diff AFAICT.

Manishearth avatar Dec 11 '25 16:12 Manishearth

Ah. It's because surpasses() calls new_balanced in the day/week case, and new_balanced has a bunch of while loops to balance out the durations. This is quadratic.

https://github.com/unicode-org/icu4x/blob/c46d47f1a81b111892932fd422c10c59a58ae189/components/calendar/src/calendar_arithmetic.rs#L550

https://github.com/unicode-org/icu4x/blob/c46d47f1a81b111892932fd422c10c59a58ae189/components/calendar/src/calendar_arithmetic.rs#L433-L454

Manishearth avatar Dec 11 '25 17:12 Manishearth

https://github.com/unicode-org/icu4x/pull/7308 is my first attempt.

It doesn't fully solve the problem, but it's progress. I can think about the balancing more in a bit.

Another potential optimization is to have a "speeding up" surpasses. I'm thinking we could write a function that, maintains a "delta", which it adds instead of just doing things one at a time, and once it surpasses, it slowly scales the delta back down until it works again.

Manishearth avatar Dec 11 '25 22:12 Manishearth