chrono icon indicating copy to clipboard operation
chrono copied to clipboard

Improve performance of `parse_from_rfc3339`

Open sby1ce opened this issue 3 months ago • 4 comments

It would be nice to have performance of DateTime::parse_from_rfc3339 method match performance of speedate::DateTime::parse_str_rfc3339

Naively looking at both implementations the difference seems to come from avoiding bounds checking and unrolling scan::number loops

https://github.com/chronotope/chrono/blob/f3fd15f976c8c8e56d97eda6237af4d485ba2297/src/format/parse.rs#L192-L196

https://github.com/pydantic/speedate/blob/54293aa5a1663148dfad90ccd818ade1d994476c/src/date.rs#L331-L361

        if bytes.len() < 10 {
            return Err(ParseError::TooShort);
        }
        let year: u16;
        let month: u8;
        let day: u8;
        unsafe {
            let y1 = get_digit_unchecked!(bytes, 0, InvalidCharYear) as u16;
            let y2 = get_digit_unchecked!(bytes, 1, InvalidCharYear) as u16;
            let y3 = get_digit_unchecked!(bytes, 2, InvalidCharYear) as u16;
            let y4 = get_digit_unchecked!(bytes, 3, InvalidCharYear) as u16;
            year = y1 * 1000 + y2 * 100 + y3 * 10 + y4;


            match bytes.get_unchecked(4) {
                b'-' => (),
                _ => return Err(ParseError::InvalidCharDateSep),
            }

Perhaps someone with more insight will correct me and actually the difference comes from Parsed struct or something like that. I can write some microbenchmarks if this is something anyone is willing to look into

sby1ce avatar Sep 18 '25 00:09 sby1ce

I'm open to doing code reviews for this -- or contact me for a commercial quote if you want to get this done sooner.

djc avatar Sep 19 '25 04:09 djc

Sure then I'll try my hand at doing this 👍

sby1ce avatar Sep 19 '25 06:09 sby1ce

Actually writing a benchmark it would appear the bottleneck is indeed in the implementation of Parsed as I somewhat suspected.

Here's a sort of MRE https://gist.github.com/sby1ce/bd5cd467982d172dd7d8d52b488a4176

and a benchmark

use std::hint::black_box;

use benchmarks::{using_naive, using_parsed};
use criterion::{Criterion, criterion_group, criterion_main};

const DATETIME: &str = "2025-08-01T00:00:00Z";

fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("date parse collect");
    group.bench_function("naive", |b| b.iter(|| using_naive(black_box(DATETIME)).unwrap()));
    group.bench_function("parsed", |b| b.iter(|| using_parsed(black_box(DATETIME)).unwrap()));
    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

The difference is quite drastic, on my machine using_naive is taking ~25 ns and using_parsed - ~80 ns.

I am unsure how to proceed. Is avoiding Parsed an acceptable solution? As I am not familiar with edge cases Parsed may cover, would that even be possible?

sby1ce avatar Sep 20 '25 20:09 sby1ce

I am unsure how to proceed. Is avoiding Parsed an acceptable solution? As I am not familiar with edge cases Parsed may cover, would that even be possible?

I don't have enough context paged in to be able to answer this question with certainty/in detail. I would just try it. We can't be rejecting massively more inputs on the 0.4.x branches unless there's a clear correctness angle, but it's probably okay to use the current test suite as a decent proxy of that.

djc avatar Sep 21 '25 10:09 djc