
ngboost

Open amiroft opened this issue 1 year ago • 9 comments

Is it possible to use ngboost as a regressor for a forecaster? If yes, how should we define it?

amiroft avatar Jun 13 '23 09:06 amiroft

Hi @amiroft, Yes, you can use ngboost as a regressor since it has a .predict method. If you are using ngboost, you may be interested in estimating the whole distribution for each forecasted step. In that case, you cannot use the pred_dist method of ngboost; however, you can use predict_dist from skforecast (https://skforecast.org/0.8.1/user_guides/probabilistic-forecasting.html#predicting-distribution-and-intervals-using-bootstrapped-residuals)

JoaquinAmatRodrigo avatar Jun 13 '23 10:06 JoaquinAmatRodrigo

That's good, but when I tried

forecaster = ForecasterAutoreg(
    regressor=NGBRegressor(**hyperparameters),
    lags=[1, 24],
)
forecaster.fit(data_train['Target0'], exog=data_train[exog])
boot_predictions = forecaster.predict_bootstrapping(
    exog=data_val[exog], steps=steps, n_boot=n_boot
)

it returns:

/usr/local/lib/python3.10/dist-packages/skforecast/ForecasterAutoregDirect/ForecasterAutoregDirect.py in fit(self, y, exog, store_in_sample_residuals)
    660                 )
    661             else:
--> 662                 self.regressors_[step].fit(
    663                     X = X_train_step,
    664                     y = y_train_step,

TypeError: NGBoost.fit() got an unexpected keyword argument 'y'

amiroft avatar Jun 13 '23 11:06 amiroft

I see. ngboost does not use the same argument naming as scikit-learn: scikit-learn's fit expects y, while ngboost's fit expects Y. Unfortunately, skforecast only allows regressors that follow the scikit-learn API.

JoaquinAmatRodrigo avatar Jun 13 '23 11:06 JoaquinAmatRodrigo

Yes. How can we have it in skforecast? Should its API be changed so that it is compatible with skforecast? Do you have any suggestions or guidance?

amiroft avatar Jun 13 '23 12:06 amiroft

If you think it is possible to have ngboost in skforecast, I can help, since I need it.

amiroft avatar Jun 13 '23 12:06 amiroft

Hello @amiroft,

The main problem is that skforecast expects a regressor that follows the scikit-learn API, i.e. fit(X, y).

As you can see in the error, our fit method internally calls the regressor's fit method:

 self.regressors_[step].fit(
    X = X_train_step,
    y = y_train_step
)

But for NGBoost it should be: (note Y instead of y)

 self.regressors_[step].fit(
    X = X_train_step,
    Y = y_train_step
)

From my point of view, there are three options that can work:

  • Use an equivalent Gradient Boosting model (LGBMRegressor, for example) with the predict_dist method of skforecast. The results should be quite similar.
  • Open an issue in NGBoost asking to implement an alias of the Y parameter to be y and thus fully follow the scikit-learn API.
  • In skforecast, you can fork the repository and add an if statement, something like:
if isinstance(self.regressor, <InsertNGBoostTYPE>):
    self.regressors_[step].fit(
        X = X_train_step,
        Y = y_train_step
    )
else:
    self.regressors_[step].fit(
        X = X_train_step,
        y = y_train_step
    )
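The branch in the third option can also be written without naming the NGBoost type, by checking which keyword the regressor's fit method accepts. The following is only a minimal sketch under assumptions: `fit_with_target_keyword` and `UppercaseYRegressor` are hypothetical names introduced here for illustration, and the stub class merely mimics the `fit(X, Y)` signature reported in the error above. It is not skforecast or NGBoost code.

```python
import inspect


class UppercaseYRegressor:
    """Hypothetical stand-in for a regressor whose fit() names the target `Y`."""

    def fit(self, X, Y):
        # Toy "model": remember the mean of the training target.
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_with_target_keyword(regressor, X, y):
    """Call regressor.fit with whichever target keyword it accepts (y or Y)."""
    parameters = inspect.signature(regressor.fit).parameters
    if "y" in parameters:
        return regressor.fit(X=X, y=y)
    if "Y" in parameters:
        return regressor.fit(X=X, Y=y)
    raise TypeError("fit() accepts neither 'y' nor 'Y' as a keyword")


X_train = [[1.0], [2.0], [3.0]]
y_train = [2.0, 4.0, 6.0]

model = UppercaseYRegressor()
try:
    # Same kind of call as in the traceback: keyword `y` against a `Y` signature.
    model.fit(X=X_train, y=y_train)
except TypeError as error:
    print(f"direct call failed: {error}")

fit_with_target_keyword(model, X_train, y_train)
print(model.predict([[4.0]]))
```

Inspecting the signature keeps the dispatch independent of any particular library, at the cost of relying on parameter names rather than on an explicit type check.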

Hope it helps!

JavierEscobarOrtiz avatar Jun 14 '23 05:06 JavierEscobarOrtiz

Thanks for the response. I actually solved this by writing a wrapper around ngboost so that it behaves like a scikit-learn estimator:

import pandas as pd
from ngboost import NGBRegressor
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import OneHotEncoder
from sklearn.impute import SimpleImputer

class NGBoostWrapper(BaseEstimator, RegressorMixin):
    def __init__(self, n_estimators=100, learning_rate=0.01, random_state=123):
        self.model = NGBRegressor(n_estimators=n_estimators, learning_rate=learning_rate, random_state=random_state)

    def fit(self, X, y):
        # Positional arguments avoid the y/Y keyword mismatch
        self.model.fit(X, y)
        return self

    def predict(self, X):
        # Point prediction from the underlying ngboost model
        return self.model.predict(X)

    def set_params(self, **params):
        self.model.set_params(**params)
        return self

    def get_params(self, deep=True):
        return self.model.get_params(deep)
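One caveat with a wrapper of this kind: scikit-learn's `clone` rebuilds an estimator by calling its constructor with the result of `get_params()`, so a `get_params` that returns the inner model's parameters would likely break cloning whenever the inner model has more parameters than the wrapper's constructor accepts. A common pattern is to store the hyperparameters as attributes and build the model inside `fit`. The sketch below shows that pattern using only the standard library; `StubDistributionRegressor` is a hypothetical stand-in for `NGBRegressor`, and `CloneFriendlyWrapper` is an illustrative name. With the real libraries one would typically inherit from `BaseEstimator` and `RegressorMixin`, which derive `get_params` and `set_params` from the constructor signature.

```python
class StubDistributionRegressor:
    """Hypothetical stand-in for NGBRegressor: fit(X, Y) and predict(X)."""

    def __init__(self, n_estimators=500, learning_rate=0.01, random_state=None):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.random_state = random_state

    def fit(self, X, Y):
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class CloneFriendlyWrapper:
    """Stores hyperparameters as attributes and builds the model inside fit()."""

    def __init__(self, n_estimators=100, learning_rate=0.01, random_state=123):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.random_state = random_state

    def get_params(self, deep=True):
        # Only constructor arguments, so type(self)(**params) always works.
        return {
            "n_estimators": self.n_estimators,
            "learning_rate": self.learning_rate,
            "random_state": self.random_state,
        }

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # The inner model is created here, from the stored hyperparameters.
        self.model_ = StubDistributionRegressor(**self.get_params())
        self.model_.fit(X, y)  # positional call sidesteps the y/Y naming
        return self

    def predict(self, X):
        return self.model_.predict(X)


wrapper = CloneFriendlyWrapper(n_estimators=50)
# A clone-style copy: re-create the estimator from its own parameters.
rebuilt = type(wrapper)(**wrapper.get_params())
rebuilt.fit([[1.0], [2.0]], [10.0, 20.0])
print(rebuilt.get_params(), rebuilt.predict([[3.0]]))
```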

But, as you know, ngboost is meant to predict probability distributions, not points. In the ngboost library, the .predict method returns point predictions, while pred_dist returns the predicted distribution. Do you have any idea how to make that method available inside skforecast with this wrapper? Is it possible?

amiroft avatar Jun 15 '23 13:06 amiroft

Hi @amiroft

The main issue is that the recursive multi-step forecasting process indeed requires point estimates at each step. To address this, we would need to implement two prediction processes. Firstly, we would need to save the predicted parameters at each step. Secondly, we would utilize these predicted parameters to obtain a point estimate that can serve as a predictor for the next step. Integrating this behavior seamlessly with the rest of the skforecast functionalities might prove challenging.
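The two-part process described above can be sketched roughly as follows. This is an illustration under assumptions, not skforecast code: `recursive_distribution_forecast` and `toy_model` are hypothetical names, the model is assumed to return a (location, scale) pair for each step, and the location is the point estimate fed back as the newest lag.

```python
from typing import Callable, List, Sequence, Tuple


def recursive_distribution_forecast(
    last_window: Sequence[float],
    predict_params: Callable[[Sequence[float]], Tuple[float, float]],
    steps: int,
) -> List[Tuple[float, float]]:
    """Store (loc, scale) per step and feed loc back as the next lag."""
    window = list(last_window)
    stored_params = []
    for _ in range(steps):
        loc, scale = predict_params(window)
        stored_params.append((loc, scale))  # first process: keep the parameters
        window = window[1:] + [loc]         # second process: point estimate becomes a lag
    return stored_params


def toy_model(window: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical model: mean of the last two lags, constant scale."""
    return 0.5 * window[-1] + 0.5 * window[-2], 1.0


forecast = recursive_distribution_forecast([2.0, 4.0], toy_model, steps=3)
print(forecast)
```

Note that feeding back only the location ignores how uncertainty accumulates over the horizon, which is part of why integrating this cleanly is not trivial.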

However, we are open to exploring new ideas and potential contributions to overcome this limitation. We welcome any suggestions or contributions that can help enhance the integration of ngboost models within the skforecast framework.

It's worth mentioning that the predict_dist method in skforecast is inspired by the idea of predicting distributions from ngboost. Although the inference process differs, it might be beneficial for you to give it a try and assess its suitability for your specific use case.
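For intuition, the bootstrapped-residuals idea referenced in the linked user guide can be sketched as below. This is a simplified illustration written with the standard library only; the function names are hypothetical and skforecast's actual implementation may differ. Each simulated path adds a randomly drawn in-sample residual to the point prediction and feeds the perturbed value back as a lag, and intervals are read from the empirical spread of the paths.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bootstrap_paths(
    last_window: Sequence[float],
    predict_point: Callable[[Sequence[float]], float],
    residuals: Sequence[float],
    steps: int,
    n_boot: int,
    seed: int = 123,
) -> List[List[float]]:
    """Simulate n_boot recursive forecast paths perturbed by sampled residuals."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_boot):
        window = list(last_window)
        path = []
        for _ in range(steps):
            value = predict_point(window) + rng.choice(residuals)
            path.append(value)
            window = window[1:] + [value]  # perturbed value becomes the newest lag
        paths.append(path)
    return paths


def empirical_interval(
    paths: List[List[float]], step: int, lower: float = 0.1, upper: float = 0.9
) -> Tuple[float, float]:
    """Approximate lower/upper quantiles of the simulated values at one step."""
    values = sorted(path[step] for path in paths)
    last = len(values) - 1
    return values[round(lower * last)], values[round(upper * last)]


def window_mean(window: Sequence[float]) -> float:
    """Hypothetical point model: average of the lags."""
    return sum(window) / len(window)


in_sample_residuals = [-1.0, -0.5, 0.0, 0.5, 1.0]
paths = bootstrap_paths([10.0, 12.0], window_mean, in_sample_residuals, steps=5, n_boot=200)
print(empirical_interval(paths, step=4))
```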

Please feel free to share any further thoughts or ideas, and don't hesitate to reach out if you need any further assistance.

JoaquinAmatRodrigo avatar Jun 16 '23 19:06 JoaquinAmatRodrigo

Hi,

If skforecast is scikit-learn compatible, you might try a standard boosting algorithm and use MAPIE instead:

https://github.com/scikit-learn-contrib/MAPIE
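MAPIE builds prediction intervals around an existing regressor using conformal prediction. The sketch below shows the basic split-conformal idea with the standard library only. It does not use the MAPIE API, the function names are hypothetical, and the coverage argument assumes exchangeable data, which time series generally do not satisfy without further adjustments.

```python
import math
from typing import List, Sequence, Tuple


def conformal_half_width(
    calibration_true: Sequence[float],
    calibration_pred: Sequence[float],
    alpha: float,
) -> float:
    """Half-width so that prediction +/- width targets (1 - alpha) coverage."""
    scores = sorted(abs(t - p) for t, p in zip(calibration_true, calibration_pred))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def conformal_intervals(
    point_predictions: Sequence[float], half_width: float
) -> List[Tuple[float, float]]:
    """Symmetric intervals around each point prediction."""
    return [(p - half_width, p + half_width) for p in point_predictions]


# Held-out calibration targets and the point predictions of any fitted regressor.
y_cal = [10.0, 12.0, 9.0, 11.0, 13.0, 10.5, 9.5]
pred_cal = [10.5, 11.0, 9.0, 12.0, 11.5, 10.0, 10.0]

width = conformal_half_width(y_cal, pred_cal, alpha=0.25)
print(conformal_intervals([11.0, 12.0], width))
```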

edgBR avatar Jul 28 '23 12:07 edgBR