Add Expanding* primitives for time series feature engineering

Open thehomebrewnerd opened this issue 1 year ago • 0 comments

Add Expanding* versions of the Rolling* primitives for time series feature engineering.

The current Rolling* primitives, such as RollingMax, compute each value from the data inside a fixed-size window, offset by a gap. The gap is what lets them build features from a target column without including the most recent values. We could add corresponding Expanding* primitives that also take a gap parameter but use all of the previous history, excluding the gap rows, to calculate each value.

This could be implemented as new primitives, by adding an optional gap parameter to the existing Cumulative* primitives, or by updating the Rolling* primitives to optionally use all history prior to the gap rows. Neither existing family covers this use case today. The Cumulative* primitives have no gap option, so using them on a target column would leak the target value at each time step. The Rolling* primitives have a gap but no option to use all history, so they cannot be used for this purpose either.
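To make the requested behavior concrete, here is a minimal pandas sketch, not a featuretools implementation. The function name `expanding_max_with_gap` is hypothetical, and it assumes that `gap=1` means "exclude the current row", so that `gap=0` reduces to an ordinary cumulative max.

```python
import pandas as pd


def expanding_max_with_gap(values: pd.Series, gap: int = 1) -> pd.Series:
    """Expanding max over all history, excluding the most recent `gap` rows.

    Hypothetical helper for illustration only. Assumes gap=1 excludes the
    current row and gap=0 behaves like a plain cumulative max.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    # Shifting by `gap` hides the most recent rows from each position;
    # the expanding window then aggregates everything that remains visible.
    return values.shift(gap).expanding(min_periods=1).max()


target = pd.Series([3, 1, 4, 1, 5])

# Plain cumulative max includes the current row, so it leaks the target.
leaky = target.cummax()

# With gap=1, each row only sees strictly earlier values; the first row
# has no usable history and is therefore NaN.
safe = expanding_max_with_gap(target, gap=1)
```

Shifting before expanding keeps the sketch short. An actual primitive would presumably also need to handle time-based gaps and minimum-period settings, which are not covered here.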

See technique 5 in this article for more info: https://www.analyticsvidhya.com/blog/2019/12/6-powerful-feature-engineering-techniques-time-series/

thehomebrewnerd, Aug 10 '22 21:08