
ForecasterAutoreg: include lags of exog in skforecast.model_selection.grid_search_forecaster ?

Open zora-no opened this issue 2 years ago • 5 comments

The docs say: "ForecasterAutoreg and ForecasterAutoregCustom allow to include exogenous variables as predictors as long as their future values are known, since they must be included during the predict process." (source) However, according to the image on the linked website, only the future value of the exogenous variable is included as a predictor, and not its lags. Is there a way to also include the lags of the exogenous variables, not just of the target, in skforecast? I'm aware that this is possible with a fixed number of lags by using ForecasterAutoregCustom(), but I'm wondering if there is a way to incorporate it into skforecast.model_selection.grid_search_forecaster (in order to not use a fixed number of lags)? Thanks a lot!

zora-no avatar May 04 '22 09:05 zora-no

Hi @zora-no, right now there is no option to create lags of exog variables automatically. However, you could manually convert the exog variable into a data frame in which each column is a different lag of the exog. Then, use a regressor that allows for feature selection, such as ridge, lasso, or gradient boosting, to identify the most informative lags.
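A minimal sketch of the manual conversion described above, using pandas.DataFrame.shift(). The helper name make_exog_lags, the column naming scheme, and the toy data are assumptions made for illustration; none of them is part of skforecast.

```python
import pandas as pd


def make_exog_lags(exog: pd.DataFrame, lags: dict) -> pd.DataFrame:
    """Return one column per (variable, lag) pair, named '<variable>_lag<k>'.

    `lags` maps an exog column name to the list of lags wanted for it.
    Hypothetical helper, not a skforecast function.
    """
    columns = {
        f"{name}_lag{k}": exog[name].shift(k)
        for name, lag_list in lags.items()
        for k in lag_list
    }
    return pd.DataFrame(columns, index=exog.index)


# Toy data: two exogenous variables observed daily.
exog = pd.DataFrame(
    {"exog_1": [10.0, 11.0, 12.0, 13.0, 14.0], "exog_2": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)

exog_lagged = make_exog_lags(exog, {"exog_1": [1, 2], "exog_2": [0]})

# Shifting leaves NaN in the first max(lag) rows; drop them here and
# drop the same rows of y before fitting so both stay aligned.
exog_lagged = exog_lagged.dropna()
```

The resulting frame can be passed as exog like any other set of exogenous columns.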

I hope this helps. We will consider implementing this feature in future releases.

JoaquinAmatRodrigo avatar May 06 '22 11:05 JoaquinAmatRodrigo

Hi @JoaquinAmatRodrigo, thanks for your response!

  1. If I create this df, where/ how can I pass it (since it would still be separate from the "normal" Xtrain matrix)?
  2. Is there also no way to manually include other variables using the custom forecaster? Or to somehow access/manipulate the Xtrain matrix of the forecaster directly?

zora-no avatar May 10 '22 09:05 zora-no

Hello @zora-no,

  1. If I create this df, where/ how can I pass it (since it would still be separate from the "normal" Xtrain matrix)?

To add exogenous variables to the training matrix, pass them to the fit() method, for example:


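# Import paths as of the 2022 skforecast releases; newer releases may differ
from sklearn.ensemble import RandomForestRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg

# Create and fit forecaster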
# ==============================================================================
forecaster = ForecasterAutoreg(
                    regressor = RandomForestRegressor(random_state=123),
                    lags      = 15
            )

forecaster.fit(
    y    = data_train['y'],
    exog = data_train[['exog_1_lag1', 'exog_1_lag2', 'exog_2_lag0', 'exog_3_lag5']]
)

Note: data_train is a DataFrame containing the series y and the exogenous variables, with one column per lag. You can create the lags with the pandas.DataFrame.shift() method: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shift.html

You can find a good explanation about exogenous variables in skforecast here:

https://joaquinamatrodrigo.github.io/skforecast/latest/notebooks/autoregresive-forecaster-exogenous.html
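Since lagged exog columns are ordinary exog columns, the number of exog lags can be varied in an outer loop around whatever evaluation is used. The sketch below illustrates the idea only: it uses ordinary least squares from NumPy and a single one-step-ahead holdout as stand-ins for the forecaster and for backtesting, and the function names and synthetic data are assumptions, not skforecast API.

```python
import numpy as np
import pandas as pd


def lagged_design(y, exog, n_y_lags, n_exog_lags):
    """Build predictors: lags 1..n_y_lags of y and lags 0..n_exog_lags of exog."""
    cols = {f"y_lag{k}": y.shift(k) for k in range(1, n_y_lags + 1)}
    cols.update({f"exog_lag{k}": exog.shift(k) for k in range(0, n_exog_lags + 1)})
    X = pd.DataFrame(cols).dropna()  # drop rows made incomplete by shifting
    return X, y.loc[X.index]


def holdout_mse(X, target, n_test):
    """Fit least squares on all but the last n_test rows, score on the rest."""
    A = np.column_stack([np.ones(len(X)), X.to_numpy()])
    t = target.to_numpy()
    coef, *_ = np.linalg.lstsq(A[:-n_test], t[:-n_test], rcond=None)
    residuals = t[-n_test:] - A[-n_test:] @ coef
    return float(np.mean(residuals ** 2))


# Synthetic data in which y depends on exog lagged by 2 steps.
rng = np.random.default_rng(0)
exog = pd.Series(rng.normal(size=200))
noise = pd.Series(rng.normal(scale=0.1, size=200))
y = (3.0 * exog.shift(2)).fillna(0.0) + noise

# Outer loop over the candidate number of exog lags.
scores = {}
for n_exog_lags in range(0, 5):
    X, target = lagged_design(y, exog, n_y_lags=2, n_exog_lags=n_exog_lags)
    scores[n_exog_lags] = holdout_mse(X, target, n_test=40)

best_n_exog_lags = min(scores, key=scores.get)
```

With skforecast, the body of the loop would instead build the lagged exog frame and run the usual search or backtest on it; the loop over candidate exog lags stays outside the library.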

  2. Is there also no way to manually include other variables using the custom forecaster? Or to somehow access/manipulate the Xtrain matrix of the forecaster directly?

Consider the following example:


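# Import paths as of the 2022 skforecast releases; newer releases may differ
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from skforecast.ForecasterAutoregCustom import ForecasterAutoregCustom

# Custom function to create predictors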
# ==============================================================================
def create_predictors(y):
    '''
    Create first 10 lags of a time series.
    Calculate moving average with window 20.
    '''

    lags = y[-1:-11:-1]
    mean = np.mean(y[-20:])
    predictors = np.hstack([lags, mean])

    return predictors

# Create forecaster
# ==============================================================================
forecaster = ForecasterAutoregCustom(
                    regressor      = RandomForestRegressor(random_state=123),
                    fun_predictors = create_predictors,
                    window_size    = 20
             )

The function create_predictors(y) can only take the series y as an argument. To transform the exogenous variables, do it outside the forecaster and pass the result to the fit() method, as in the previous example.
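As a rough illustration of transforming an exogenous variable outside the forecaster, the sketch below derives a lag and a rolling mean with pandas. The column names, window size, and data are made up for the example.

```python
import pandas as pd

# Toy exogenous series; the name and values are illustrative only.
exog = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name="exog_1")

exog_features = pd.DataFrame(
    {
        "exog_1": exog,                                       # current value (lag 0)
        "exog_1_lag1": exog.shift(1),                         # value one step earlier
        "exog_1_roll_mean_3": exog.rolling(window=3).mean(),  # mean of the last 3 values
    }
).dropna()  # the first rows are incomplete because of the lag and the window
```

The rows removed by dropna() must also be removed from y before both are passed to fit().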

More info:

https://joaquinamatrodrigo.github.io/skforecast/latest/notebooks/custom-predictors.html

JavierEscobarOrtiz avatar May 12 '22 08:05 JavierEscobarOrtiz

Hi! First of all, thanks for this nice library. Just to say that I would also be interested in this feature.

It would be nice to support lags of exogenous variables not only in grid_search_forecaster but also in ForecasterAutoreg.

I've implemented a solution that computes lags of the exogenous data before the fit. One issue is that these lags are not stored in ForecasterAutoreg.last_window, which forces me to overwrite it in order to make a prediction.

Below is a snapshot of my solution:

from typing import Union

import numpy as np

class Forecaster:
    def __init__(
        self,
        regressor,  # default value was left blank in the original post
        y_lags: Union[int, np.ndarray, list] = 1,
        # For a dict, each key is an exog name and each value its associated lags
        exog_lags: Union[dict[str, Union[int, np.ndarray, list]], int, np.ndarray, list] = 0,
    ):
        ...

Here, Forecaster is a wrapper around your ForecasterAutoreg.

Thanks again

tkaraouzene avatar Jul 13 '22 18:07 tkaraouzene

Hi!

To complete my previous post: if exog_lags are computed, the last window of the exogenous data also has to be stored for predict().

Below is a suggestion:

[original code]:

def fit(...):
    ...
    # The last time window of training data is stored so that lags needed as
    # predictors in the first iteration of `predict()` can be calculated.
    self.last_window = y_train.iloc[-self.max_lag:].copy()

[modified code]:

def fit(...):
    ...
    # The last time window of training data is stored so that lags needed as
    # predictors in the first iteration of `predict()` can be calculated.
    self.y_last_window = y_train.iloc[-self.max_lag:].copy()
    self.exog_last_window = X_train.iloc[-self.max_lag:].copy()
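To show why the stored exog window matters at prediction time, here is a small sketch that rebuilds lagged exog columns for the forecast horizon from a stored window plus the known future values. The function future_exog_lags and its arguments are hypothetical; this is not how skforecast handles it internally.

```python
import pandas as pd


def future_exog_lags(exog_last_window, exog_future, lags, name="exog"):
    """Lagged exog columns for the forecast horizon.

    exog_last_window: the last max(lags) training values of the exog series.
    exog_future: the known future values, indexed by the steps to predict.
    Hypothetical helper, not part of skforecast.
    """
    # Join history and future so that shift() can reach back into training data.
    full = pd.concat([exog_last_window, exog_future])
    lagged = pd.DataFrame({f"{name}_lag{k}": full.shift(k) for k in lags})
    # Keep only the rows that correspond to the steps being predicted.
    return lagged.loc[exog_future.index]


exog_last_window = pd.Series([70.0, 80.0, 90.0], index=[7, 8, 9])
exog_future = pd.Series([100.0, 110.0], index=[10, 11])

exog_predict = future_exog_lags(exog_last_window, exog_future, lags=[1, 3])
```

Without the stored window, the first rows of the horizon would contain NaN for every lag greater than zero.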

tkaraouzene avatar Jul 15 '22 07:07 tkaraouzene

Hi, It has been more than 60 days since the last activity on this GitHub issue. Since we have not received any updates or progress reports, we will be closing this issue.

If this issue is still relevant and requires attention, please feel free to reopen it and provide an update. We appreciate your contributions and would love to see this issue resolved if it is still relevant.

Thank you for your participation and cooperation!

JoaquinAmatRodrigo avatar Apr 05 '23 10:04 JoaquinAmatRodrigo