ciso8601

The optimization of speeding up

Open qidanrui opened this issue 1 year ago • 1 comments

Hi @movermeyer, great project! I'm a little curious about why your implementation is faster than pendulum. What optimizations have you used?

qidanrui avatar Jul 21 '22 17:07 qidanrui

@qidanrui

Disclaimers:

  • I haven't looked at the pendulum code in some time, so I might be forgetting things
  • Unlike pendulum, ciso8601 implements only the most common subset of the ISO 8601 spec. This allows the code to be simpler.

For a long time, ciso8601 was much slower than pendulum when it came to parsing timestamps that contained timezones.

In ciso8601 v2.0, we switched to using the datetime.timezone C objects, newly exposed as part of Python 3.7. This dramatically improved performance for those using ciso8601 with Python 3.7+, and made ciso8601 comparable to, but still slightly slower than, pendulum for timestamps with timezones.

Then we copied the pendulum idea of creating a barebones FixedOffset timezone object. It gave ciso8601 the performance edge over pendulum.

These FixedOffset objects are much more lightweight (in both memory size and code complexity) than the Python 3.7 datetime.timezone objects, and even more so when compared against the pytz objects ciso8601 was using for Python 2.7 compatibility.
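To give a rough idea of what "barebones" means here, below is a minimal pure-Python sketch of a fixed-offset timezone. It is only an illustration: the real ciso8601 and pendulum objects are implemented in C, and the constructor argument (`offset_seconds`) and the `tzname` format used here are assumptions, not the library's actual API.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical pure-Python stand-in for a barebones fixed-offset timezone."""

    def __init__(self, offset_seconds):
        # Store the offset once; there are no DST rules and no named zones.
        self._offset_seconds = offset_seconds
        self._offset = timedelta(seconds=offset_seconds)

    def utcoffset(self, dt):
        # The offset never changes, regardless of the datetime passed in.
        return self._offset

    def dst(self, dt):
        # A fixed offset carries no daylight-saving adjustment.
        return timedelta(0)

    def tzname(self, dt):
        # Assumed display format for this sketch, e.g. "UTC+05:30".
        sign = "-" if self._offset_seconds < 0 else "+"
        hours, remainder = divmod(abs(self._offset_seconds), 3600)
        minutes = remainder // 60
        return f"UTC{sign}{hours:02d}:{minutes:02d}"


# Usage: attach the offset to a datetime and compare with the stdlib equivalent.
dt = datetime(2022, 7, 21, 17, 7, tzinfo=FixedOffset(19800))
stdlib_dt = datetime(2022, 7, 21, 17, 7, tzinfo=timezone(timedelta(hours=5, minutes=30)))
```

The object holds a single offset and does nothing else, which is where the savings in memory and code complexity come from.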

We also started caching the timezone objects. This doesn't help the single-parse benchmarks you see in the README (where we disable caching), but it does speed things up considerably in real-world usage.


You can dig more into the performance improvements by checking out the performance label.

movermeyer avatar Jul 22 '22 13:07 movermeyer

Closing this as answered.

Going forward, new performance-related PRs will continue to get the performance label added to them.

movermeyer avatar Dec 01 '22 04:12 movermeyer