python-rasterstats icon indicating copy to clipboard operation
python-rasterstats copied to clipboard

Multiband support

Open perrygeo opened this issue 9 years ago • 5 comments

zonalstats was originally optimized for large vector feature collections against a single raster band (i.e. thousands of polygons against a DEM raster)

There is another use case which zonalstats doesn't handle so well: many rasters/bands against a single geometry (i.e. the value at a single point for thousands of climate model simulations or a multispectral image). You can loop through and call zonal stats on each raster/band but this incurs the overhead of the rasterio.open each time. A more efficient mechanism might be to stack all the data into a multi-band raster and open it once, slice the 3D array and don't apply the aggregate functions to the 3rd dim - return an array rather than a scalar statistic.

I'll have to do some research to see how much refactoring/code clutter this would cause. Otherwise, it might just be a new zonalstats_multiband function.

perrygeo avatar Aug 14 '15 22:08 perrygeo

Work started in multiband branch

Still need:

  • [ ] bands for arrays (right now we assume 2D)
  • [ ] check for 2 or 3D (4D not supported)
  • [ ] deprecate, document the band_num to band change
  • [ ] categorical by band?
  • [ ] nodata by band?
  • [ ] additional stats - apply by band or let the function implement the handling or bands?
  • [ ] when all bands specified, read raster meta instead of assuming [1, 2, 3]

perrygeo avatar Oct 07 '15 16:10 perrygeo

Bummer

array.mean(axis=(-1, -2))  # Works; the mean of the last two axes
masked.mean(axis=(-1, -2))  # TypeError: tuple indices must be integers, not tuple

perrygeo avatar Oct 12 '15 15:10 perrygeo

There is an open PR (https://github.com/numpy/numpy/pull/5706) that serves as a proof of concept fix. This would need to be applied to all the functions used by the stats:

min, max, mean, count, std, sum, median, unqiue, percentile

Maybe looping through the bands is the only viable approach for now - even if we could make those changes and get them merged, it will be a while before they make it into mainstream releases.

perrygeo avatar Oct 13 '15 11:10 perrygeo

+1 for reopening this one.

jonschwenk avatar Jun 20 '22 19:06 jonschwenk

For my use case it would be crucial to have some multiband feature. For instance, I would like to perform stand based statistics on vector values, e.g. pixel brightness (combination of RGB values), pixel hue, etc. These cannot be independently calculated by bands and combined later, the transformation must be before aggregation.

(Returning vector values would be also nice, but this could be worked around, calculating vector, and return selected vector component.)

dvolgyes avatar Jul 22 '22 13:07 dvolgyes