darts
[BUG] RegressionEnsembleModel fails with base estimators that use `output_chunk_shift > 0`
I'm trying to use RegressionEnsembleModel to improve multi-step predictions. The two forecasting models I'm using are LightGBMModel and XGBModel. Their model parameters and hyperparameters are listed below:
PRICE_LGBM_PARAMS = {
    "lags": [-1, -12, -24],
    "lags_past_covariates": 2,
    "lags_future_covariates": [1, 3, 6, 9, 12],
    "output_chunk_length": 24,
    "output_chunk_shift": 13,
    "add_encoders": {
        "datetime_attribute": {
            "future": ["hour"],
            "past": ["hour"],
        },
        "tz": TIME_ZONE,
    },
    "n_estimators": 361,
    "learning_rate": 0.01,
    "max_depth": 3,
    "num_leaves": 61,
    "feature_fraction": 0.96,
    "verbose": -1,
}
PRICE_XGB_PARAMS = {
    "lags": 1,
    "lags_past_covariates": 1,
    "lags_future_covariates": [1],
    "output_chunk_length": 24,
    "output_chunk_shift": 13,
    "add_encoders": {
        "datetime_attribute": {"past": ["hour"]},
        "position": {"past": ["relative"], "future": ["relative"]},
        "transformer": Scaler(),
        "tz": TIME_ZONE,
    },
    "multi_models": True,
}
Next, I built and fitted the ensemble as follows:
models = [lgbm_model, xgb_model]
# The two models above have been fitted already
regression_model = LinearRegressionModel(
    lags=None,
    lags_past_covariates=None,
    lags_future_covariates=[0],
    multi_models=True,
    output_chunk_length=24,
)
ensemble_model = RegressionEnsembleModel(
    forecasting_models=models,
    regression_model=regression_model,
    regression_train_n_points=-1,
    train_forecasting_models=False,
    train_using_historical_forecasts=True,
)
ensemble_model.fit(
    series=processed_data["price_ts"],
    past_covariates=processed_data["past_covs"],
    future_covariates=processed_data["train_future_covs"],
)
The error message I got is:
ValueError: Cannot perform auto-regression (n > output_chunk_length) with a model that uses a shifted output chunk (output_chunk_shift > 0)
I'm wondering how to fix this. Thanks for your help!
Hi @andrew20012656, and thanks for raising this issue. There seems to be a bug in our RegressionEnsembleModel when trying to perform ensembling of models that use output_chunk_shift > 0. I'll add it to our backlog.
For reference, here is a minimal reproducible example:
from darts.datasets import AirPassengersDataset
from darts.models import RegressionEnsembleModel, LinearRegressionModel
m1 = LinearRegressionModel(lags=[-1, -12], output_chunk_shift=13, output_chunk_length=24)
m2 = LinearRegressionModel(lags=[-1], output_chunk_shift=13, output_chunk_length=24)
series = AirPassengersDataset().load()
m1.fit(series)
m2.fit(series)
models = [m1, m2]
regression_model = LinearRegressionModel(
    lags_future_covariates=[0],
    output_chunk_length=24,
)
ensemble_model = RegressionEnsembleModel(
    forecasting_models=models,
    regression_model=regression_model,
    regression_train_n_points=-1,
    train_forecasting_models=False,
    train_using_historical_forecasts=True,
)
ensemble_model.fit(series=series)
pred = ensemble_model.predict(series=series, n=24)
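For readers unfamiliar with the constraint named in the error message, here is a minimal plain-Python sketch of it. It does not use or reproduce darts' internal code; the helper names `forecast_window` and `can_forecast` are hypothetical and exist only for illustration. The assumption is that a model with `output_chunk_shift=13` and `output_chunk_length=24` predicts steps 14 through 37 after the end of the series in a single pass, and that the skipped steps 1 through 13 are never predicted, which would explain why a longer horizon cannot be reached by feeding predictions back in.

```python
def forecast_window(output_chunk_length: int, output_chunk_shift: int) -> tuple[int, int]:
    """Return the (first, last) future steps covered by one forecast pass.

    Steps are counted from 1, where step 1 is the first time step after the
    end of the input series. Illustrative helper, not a darts function.
    """
    first = output_chunk_shift + 1
    last = output_chunk_shift + output_chunk_length
    return first, last


def can_forecast(n: int, output_chunk_length: int, output_chunk_shift: int) -> bool:
    """Mirror the condition stated in the error message.

    Auto-regression is needed when n > output_chunk_length, and the message
    says this is rejected whenever output_chunk_shift > 0.
    """
    needs_auto_regression = n > output_chunk_length
    return not (needs_auto_regression and output_chunk_shift > 0)


# Values taken from the reproducer above.
print(forecast_window(output_chunk_length=24, output_chunk_shift=13))
print(can_forecast(n=24, output_chunk_length=24, output_chunk_shift=13))
print(can_forecast(n=25, output_chunk_length=24, output_chunk_shift=13))
```

Under this reading, the reproducer's `predict(n=24)` call is within the limit, so the error would have to come from somewhere else requesting a horizon longer than 24 from the base models. Where exactly that happens is not established in this thread.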