Speed up operations that use the `Coord.cells` method for time coordinates

Open bouweandela opened this issue 3 years ago • 1 comments

🚀 Pull Request

Description

This speeds up functions that use the `Coord.cells` method to generate many cells describing a time coordinate. Affected operations include, for example, `Cube.extract`, `Cube.subset`, and `Coord.intersection`.

Here is a script that demonstrates this:

import cf_units
import iris.cube
import iris.coords
import iris.time
import numpy as np

time_units = cf_units.Unit('days since 1850-01-01', calendar='standard')
time = iris.coords.DimCoord(np.arange(10000, dtype=np.float64), standard_name='time', units=time_units)
cube = iris.cube.Cube(np.arange(10000, dtype=np.float32))
cube.add_dim_coord(time, 0)
pdt1 = iris.time.PartialDateTime(year=1852)
pdt2 = iris.time.PartialDateTime(year=1854)
constraint = iris.Constraint(time=lambda cell: pdt1 <= cell.point < pdt2)

%timeit cube.extract(constraint)

Before:

1.4 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

After:

36.7 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
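`%timeit` is an IPython magic, so the measurement above only runs inside IPython or Jupyter. As a rough sketch of how a comparable number could be collected from a plain Python script, the standard-library `timeit` module can be used. The helper name `time_call` and the stand-in workload are hypothetical and are not part of this pull request.

```python
import statistics
import timeit


def time_call(func, repeat=7, number=1):
    """Time a zero-argument callable, roughly like IPython's %timeit.

    Returns (mean, std) in seconds per call over `repeat` runs of
    `number` calls each. `time_call` is a hypothetical helper name.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.pstdev(per_call)


# Stand-in workload so the sketch runs without Iris installed.
# With the script above, one would pass: lambda: cube.extract(constraint)
mean, std = time_call(lambda: sum(range(1000)), repeat=7, number=10)
print(f"{mean * 1e6:.1f} µs ± {std * 1e6:.1f} µs per call")
```

Unlike `%timeit`, this sketch does not choose the number of loops automatically, so `number` has to be set by hand for very fast or very slow calls.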

Note that the changes in this pull request do slow down the case where only a few cells are actually generated:

Before:

%timeit time.cells()
1.24 µs ± 20.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
%timeit next(time.cells())
140 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

After:

%timeit time.cells()
216 ns ± 7.68 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
%timeit next(time.cells())
16.8 ms ± 395 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

but you can still use Coord.cell if you need just a few cells.
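The timings above (creating the generator stays cheap, while the first `next()` becomes expensive) are consistent with converting all points in one batch when iteration starts. That is an assumption drawn from the numbers, not a description of the actual Iris code. The pure-Python sketch below only illustrates this cost structure; `CountingConverter`, `cells_per_item`, and `cells_batched` are hypothetical names, and `datetime` is a stand-in for the real date conversion.

```python
from datetime import datetime, timedelta

# Stand-in for the 'days since 1850-01-01' reference date in the script above.
EPOCH = datetime(1850, 1, 1)


class CountingConverter:
    """Hypothetical helper: converts day offsets to datetimes and counts calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, days):
        self.calls += 1
        return EPOCH + timedelta(days=float(days))


def cells_per_item(points, convert):
    """Convert each point only when the consumer asks for it."""
    for point in points:
        yield convert(point)


def cells_batched(points, convert):
    """Convert every point on the first next(), then hand out the results."""
    converted = [convert(point) for point in points]
    yield from converted


points = range(10000)
per_item, batched = CountingConverter(), CountingConverter()

first_a = next(cells_per_item(points, per_item))
first_b = next(cells_batched(points, batched))

# Same first value, very different amount of work to obtain it.
print(per_item.calls, batched.calls)  # 1 10000
```

In this toy version the batch brings no speed-up by itself. Batching would only pay off if converting many points at once costs less than converting them one at a time, which is what the `Cube.extract` timings suggest for time coordinates.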

Closes #4957.


Consult Iris pull request check list

bouweandela avatar Sep 16 '22 12:09 bouweandela

I ran a few more experiments, and this implementation is faster if you generate more than roughly 100 cells.

bouweandela avatar Sep 16 '22 14:09 bouweandela

Thanks! I added a what's new entry.

bouweandela avatar Sep 29 '22 12:09 bouweandela

I've just been manually re-running some benchmarks. Congratulations!

       before           after         ratio
     [f69b93f2]       [27422111]
-        42.2±7ms       34.7±0.9ms     0.82  load.TimeConstraint.time_time_constraint(20, 'NetCDF')

trexfeathers avatar Oct 04 '22 11:10 trexfeathers