tobac

Bulk statistics very slow for non-contiguous array data

Open w-k-jones opened this issue 11 months ago • 0 comments

I've recently noticed that bulk statistics can run very slowly when applied to non-contiguous data. This can happen when slicing dask arrays or broadcasting along the trailing dimension. Calling ravel on these arrays is ~20x slower than on contiguous arrays, presumably because it has to copy the data rather than return a view, and since we do this for each feature it adds up to a big slowdown. I might look into smarter ways of doing this in future.
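A minimal sketch of how to check whether ravel will copy, using plain NumPy arrays rather than tobac's actual inputs. The helper name `ravel_copies` and the example arrays are made up for illustration; the slowdown factor itself is not measured here.

```python
import numpy as np


def ravel_copies(arr):
    """Return True if arr.ravel() had to allocate a copy (hypothetical helper)."""
    # ravel returns a view when the array is contiguous, otherwise a copy,
    # so a result that shares no memory with the input must be a copy.
    return not np.shares_memory(arr.ravel(), arr)


# Contiguous base array
base = np.arange(24, dtype=float).reshape(4, 6)

# Strided slice: every second column, no longer contiguous
sliced = base[:, ::2]

# Broadcast along the trailing dimension: zero stride on the last axis
broadcast = np.broadcast_to(np.arange(4.0)[:, None], (4, 6))

for name, arr in [("base", base), ("sliced", sliced), ("broadcast", broadcast)]:
    print(name, "contiguous:", arr.flags.c_contiguous, "ravel copies:", ravel_copies(arr))
```

If the copy is the bottleneck, one option might be to make the field contiguous once with np.ascontiguousarray before looping over features, instead of paying for a copy per feature.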

Using np.split might be a fast approach, as shown in https://stackoverflow.com/a/43094244
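A rough sketch of what a sort-and-split grouping could look like, assuming an integer label field and a data field of the same shape. `group_values_by_label` is a hypothetical name, not part of tobac, and the details may differ from the linked answer. The idea is to ravel and sort once for the whole field, then cut the sorted values at the label boundaries.

```python
import numpy as np


def group_values_by_label(labels, values):
    """Group values by label with one sort and one np.split (hypothetical helper).

    Returns the sorted unique labels and a list with one value array per label.
    """
    flat_labels = np.asarray(labels).ravel()  # single ravel for the whole field
    flat_values = np.asarray(values).ravel()

    # Stable sort keeps the original order of values within each label
    order = np.argsort(flat_labels, kind="stable")
    sorted_labels = flat_labels[order]

    # Index of the first occurrence of each label in the sorted array
    unique_labels, first_index = np.unique(sorted_labels, return_index=True)

    # Split at the start of every label except the first
    groups = np.split(flat_values[order], first_index[1:])
    return unique_labels, groups


# Toy example: three labels on a 2x3 grid
labels = np.array([[0, 1, 1], [2, 2, 1]])
values = np.array([[10.0, 11.0, 12.0], [13.0, 14.0, 15.0]])

unique_labels, groups = group_values_by_label(labels, values)
means = {int(label): float(group.mean()) for label, group in zip(unique_labels, groups)}
```

Note that this sketch treats every label the same, including any background label, so a caller would need to drop that group if it is not wanted.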

w-k-jones • Mar 18 '24 09:03