Extracting a time range from a cube is slow

Open bouweandela opened this issue 1 year ago • 5 comments

📰 Custom Issue

Extracting a time range as described in the documentation is quite slow if you want to do it for many cubes and/or cubes with many time points. For a single cube with 10000 time points it already takes 2 seconds on my computer, so if I want to subset a few hundred cubes that becomes quite slow.

Here is a script that demonstrates this:

import cf_units
import iris.cube
import iris.coords
import iris.time
import numpy as np

# Daily time axis with 10000 points, starting at 1850-01-01
time_units = cf_units.Unit('days since 1850-01-01', calendar='standard')
time = iris.coords.DimCoord(np.arange(10000, dtype=np.float64), standard_name='time', units=time_units)
cube = iris.cube.Cube(np.arange(10000, dtype=np.float32))
cube.add_dim_coord(time, 0)

# Select every point with 1852 <= year < 1854
pdt1 = iris.time.PartialDateTime(year=1852)
pdt2 = iris.time.PartialDateTime(year=1854)
constraint = iris.Constraint(time=lambda cell: pdt1 <= cell.point < pdt2)

%timeit cube.extract(constraint)

Result:

1.83 s ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

From looking at the code in iris.coords, it looks like the slow behaviour is caused by converting all time points to datetimes individually for each cell, instead of converting them once and then generating the cells.

Here is some code with timings:

%timeit time.units.num2date(time.points)
27.3 ms ± 3.18 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

and

%timeit list(time.units.num2date(p) for p in time.points)
1.53 s ± 29.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

If this is of interest, I can open a pull request that changes the code to convert all the time points first and then generate the cells.
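The "convert everything once, then select" idea can be sketched outside of Iris. The example below is only an illustration, not the Iris implementation or the change proposed in the pull request. It assumes that NumPy's `datetime64` (proleptic Gregorian) is an acceptable stand-in for the 'standard' calendar, which holds only for dates after 1582, and `year_range_indices` is a hypothetical helper name.

```python
import numpy as np

# Assumption: points are "days since 1850-01-01" and all dates fall after 1582,
# so NumPy's proleptic Gregorian calendar agrees with the 'standard' calendar.
EPOCH = np.datetime64("1850-01-01")


def year_range_indices(points, start_year, end_year):
    """Return indices of points with start_year <= year < end_year.

    Hypothetical helper: all points are converted in a single vectorised
    step instead of one conversion call per point.
    """
    # Convert fractional days to whole seconds, then to dates, in one go.
    seconds = np.round(np.asarray(points, dtype=np.float64) * 86400).astype(np.int64)
    dates = EPOCH + seconds.astype("timedelta64[s]")

    # Compare the whole array against the two boundaries at once.
    lower = np.datetime64(f"{start_year:04d}-01-01")
    upper = np.datetime64(f"{end_year:04d}-01-01")
    mask = (dates >= lower) & (dates < upper)
    return np.flatnonzero(mask)


points = np.arange(10000, dtype=np.float64)
indices = year_range_indices(points, 1852, 1854)
```

Because the time points here increase monotonically, the selected indices are contiguous, so a plain slice such as `cube[indices[0]:indices[-1] + 1]` should pick out the same range from the cube without building a cell for every point.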

bouweandela avatar Sep 09 '22 15:09 bouweandela

This was previously raised at #3609, which went stale. So I think this is a desirable feature that no-one got around to addressing yet.

rcomer avatar Sep 10 '22 09:09 rcomer

Fancy taking it on @rcomer ? :wink:

bjlittle avatar Sep 14 '22 10:09 bjlittle

I think @bouweandela was offering to put something up, and has clearly already given it more thought than I have!

rcomer avatar Sep 14 '22 12:09 rcomer

Yes, I already tried to implement something. I'll open a pull request and we can take it from there.

bouweandela avatar Sep 16 '22 12:09 bouweandela

Just opened a pull request here: https://github.com/SciTools/iris/pull/4969

bouweandela avatar Sep 16 '22 13:09 bouweandela