window_ops
Fast window operations
Window ops
This library is intended as an alternative to
pd.Series.rolling and pd.Series.expanding. It gains its speedup from
numba-optimized functions that operate on numpy arrays. There are also
online classes for more efficient updates of window statistics.
Install
pip install window-ops
How to use
Transformations
For transformations that map n_samples -> n_samples, you can use
[seasonal_](rolling|expanding)_(mean|max|min|std) on an array.
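To make the n_samples -> n_samples shape concrete, here is a minimal plain-Python sketch of what a rolling mean and its seasonal variant compute. It is a reference for the semantics only, not the library's numba implementation. The function names are hypothetical. It assumes that positions without a full window are NaN (mirroring the pandas default) and that the seasonal variant applies the rolling statistic to samples spaced season_length apart.

```python
import math


def rolling_mean_reference(values, window_size):
    """Rolling mean with one output per input; NaN until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(math.nan)  # assumption: incomplete windows yield NaN
        else:
            window = values[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out


def seasonal_rolling_mean_reference(values, season_length, window_size):
    """Apply the rolling mean separately to each seasonal subseries."""
    out = [math.nan] * len(values)
    for season in range(season_length):
        # Positions season, season + season_length, season + 2 * season_length, ...
        positions = range(season, len(values), season_length)
        subseries = [values[i] for i in positions]
        for i, stat in zip(positions, rolling_mean_reference(subseries, window_size)):
            out[i] = stat
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(rolling_mean_reference(data, window_size=2))
# [nan, 1.5, 2.5, 3.5, 4.5, 5.5]
print(seasonal_rolling_mean_reference(data, season_length=2, window_size=2))
# [nan, nan, 2.0, 3.0, 4.0, 5.0]
```

In both cases the output has the same length as the input, which is what lets the result be used as a drop-in column next to the original series.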
Benchmarks
pd.__version__
'1.3.5'
n_samples = 10_000 # array size
window_size = 8 # for rolling operations
season_length = 7 # for seasonal operations
execute_times = 10 # number of times each function will be executed
Average times in milliseconds.
times.applymap('{:.2f}'.format)
| | window_ops | pandas |
|---|---|---|
| rolling_mean | 0.03 | 0.43 |
| rolling_max | 0.14 | 0.57 |
| rolling_min | 0.14 | 0.58 |
| rolling_std | 0.06 | 0.54 |
| expanding_mean | 0.03 | 0.31 |
| expanding_max | 0.05 | 0.76 |
| expanding_min | 0.05 | 0.47 |
| expanding_std | 0.09 | 0.41 |
| seasonal_rolling_mean | 0.05 | 3.89 |
| seasonal_rolling_max | 0.18 | 4.27 |
| seasonal_rolling_min | 0.18 | 3.75 |
| seasonal_rolling_std | 0.08 | 4.38 |
| seasonal_expanding_mean | 0.04 | 3.18 |
| seasonal_expanding_max | 0.06 | 3.29 |
| seasonal_expanding_min | 0.06 | 3.28 |
| seasonal_expanding_std | 0.12 | 3.89 |
speedups = times['pandas'] / times['window_ops']
speedups = speedups.to_frame('times faster')
speedups.applymap('{:.0f}'.format)
| | times faster |
|---|---|
| rolling_mean | 15 |
| rolling_max | 4 |
| rolling_min | 4 |
| rolling_std | 9 |
| expanding_mean | 12 |
| expanding_max | 15 |
| expanding_min | 9 |
| expanding_std | 4 |
| seasonal_rolling_mean | 77 |
| seasonal_rolling_max | 23 |
| seasonal_rolling_min | 21 |
| seasonal_rolling_std | 52 |
| seasonal_expanding_mean | 78 |
| seasonal_expanding_max | 52 |
| seasonal_expanding_min | 51 |
| seasonal_expanding_std | 33 |
Online
If you have an array for which you want to compute a window statistic
and then keep updating it as more samples come in, you can use the
classes in the window_ops.online module. They all have a
fit_transform method, which takes the array and returns the
transformations defined above, and an update method, which takes
a single value and returns the new statistic.
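The fit_transform / update pattern described above can be sketched in plain Python as follows. This is an illustration of the idea, not the code in window_ops.online: the class name RollingMeanSketch is hypothetical, and returning NaN for incomplete windows is an assumption that mirrors pandas.

```python
import math
from collections import deque


class RollingMeanSketch:
    """Hypothetical online rolling mean that keeps only the last window."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque(maxlen=window_size)
        self.total = 0.0

    def update(self, value):
        """Add one sample and return the statistic for the new window."""
        if len(self.window) == self.window_size:
            self.total -= self.window[0]  # value about to be evicted
        self.window.append(value)
        self.total += value
        if len(self.window) < self.window_size:
            return math.nan  # assumption: incomplete windows yield NaN
        return self.total / self.window_size

    def fit_transform(self, values):
        """Reset the state, then return one statistic per input sample."""
        self.window.clear()
        self.total = 0.0
        return [self.update(v) for v in values]


rolling = RollingMeanSketch(window_size=3)
print(rolling.fit_transform([1.0, 2.0, 3.0, 4.0]))  # [nan, nan, 2.0, 3.0]
print(rolling.update(5.0))  # 4.0
```

Each update touches only the stored window rather than recomputing over the whole array, which is where the efficiency of an online class comes from.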
Benchmarks
Average time in milliseconds it takes to transform the array and perform 100 updates.
times.to_frame().applymap('{:.2f}'.format)
| | average time (ms) |
|---|---|
| RollingMean | 0.12 |
| RollingMax | 0.23 |
| RollingMin | 0.22 |
| RollingStd | 0.32 |
| ExpandingMean | 0.10 |
| ExpandingMax | 0.07 |
| ExpandingMin | 0.07 |
| ExpandingStd | 0.17 |
| SeasonalRollingMean | 0.28 |
| SeasonalRollingMax | 0.35 |
| SeasonalRollingMin | 0.38 |
| SeasonalRollingStd | 0.42 |
| SeasonalExpandingMean | 0.17 |
| SeasonalExpandingMax | 0.14 |
| SeasonalExpandingMin | 0.15 |
| SeasonalExpandingStd | 0.23 |