[BUG] RegressionEnsembleModel fails with base estimators that use `output_chunk_shift > 0`

Open · andrew20012656 opened this issue 8 months ago · 1 comment

I'm trying to use RegressionEnsembleModel to improve multi-step predictions. The two forecasting models I'm using are LightGBMModel and XGBoostModel; their model parameters and hyperparameters are listed below:

PRICE_LGBM_PARAMS = {
    "lags": [-1, -12, -24],
    "lags_past_covariates": 2,
    "lags_future_covariates": [1, 3, 6, 9, 12],
    "output_chunk_length": 24,
    "output_chunk_shift": 13,
    "add_encoders": {
        "datetime_attribute": {
            "future": ["hour"],
            "past": ["hour"],
        },
        "tz": TIME_ZONE,
    },
    "n_estimators": 361,
    "learning_rate": 0.01,
    "max_depth": 3,
    "num_leaves": 61,
    "feature_fraction": 0.96,
    "verbose": -1,
}

PRICE_XGB_PARAMS = {
    "lags": 1,
    "lags_past_covariates": 1,
    "lags_future_covariates": [1],
    "output_chunk_length": 24,
    "output_chunk_shift": 13,
    "add_encoders": {
        "datetime_attribute": {"past": ["hour"]},
        "position": {"past": ["relative"], "future": ["relative"]},
        "transformer": Scaler(),
        "tz": TIME_ZONE,
    },
    "multi_models": True,
}

Next, I built the ensemble as follows:

models = [lgbm_model, xgb_model]
# The two models above have been fitted already
regression_model = LinearRegressionModel(
    lags=None,
    lags_past_covariates=None,
    lags_future_covariates=[0],
    multi_models=True,
    output_chunk_length=24,
)

ensemble_model = RegressionEnsembleModel(
    forecasting_models=models,
    regression_model=regression_model,
    regression_train_n_points=-1,
    train_forecasting_models=False,
    train_using_historical_forecasts=True,
)

ensemble_model.fit(
    series=processed_data["price_ts"],
    past_covariates=processed_data["past_covs"],
    future_covariates=processed_data["train_future_covs"],
)

The error message I got is:

ValueError: Cannot perform auto-regression (n > output_chunk_length) with a model that uses a shifted output chunk (output_chunk_shift > 0)
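
The condition named in this message can be illustrated with a small sketch. This is not darts' actual implementation: the helper names `check_forecast_horizon` and `forecast_steps` are hypothetical, and the only rule encoded is the one stated in the error text (a horizon `n` longer than `output_chunk_length` needs auto-regression, which is rejected when `output_chunk_shift > 0`). Which internal call in the ensemble requests such a horizon is not established here.

```python
def check_forecast_horizon(n: int, output_chunk_length: int, output_chunk_shift: int = 0) -> None:
    """Hypothetical helper mirroring the rule stated in the error message."""
    needs_autoregression = n > output_chunk_length
    if needs_autoregression and output_chunk_shift > 0:
        raise ValueError(
            "Cannot perform auto-regression (n > output_chunk_length) with a model "
            "that uses a shifted output chunk (output_chunk_shift > 0)"
        )


def forecast_steps(output_chunk_length: int, output_chunk_shift: int = 0) -> range:
    """Steps ahead of the last observed point covered by a single forecast.

    Assumes the shift moves the start of the output chunk that many steps
    into the future, so an unshifted chunk starts at step 1.
    """
    first = output_chunk_shift + 1
    return range(first, first + output_chunk_length)


# Values from the configs above: output_chunk_length=24, output_chunk_shift=13.
check_forecast_horizon(n=24, output_chunk_length=24, output_chunk_shift=13)  # passes

try:
    check_forecast_horizon(n=25, output_chunk_length=24, output_chunk_shift=13)
except ValueError as err:
    print(err)

steps = forecast_steps(output_chunk_length=24, output_chunk_shift=13)
print(steps[0], steps[-1])
```

Under these assumptions, a direct forecast with both base models is only possible for `n <= 24`, and that forecast starts 14 steps after the end of the input series rather than at the next step.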

I'm wondering how to fix this. Thanks for your help!

andrew20012656 · Apr 15 '25 07:04

Hi @andrew20012656, and thanks for raising this issue. There seems to be a bug in our RegressionEnsembleModel when trying to perform ensembling of models that use output_chunk_shift > 0. I'll add it to our backlog.

For reference, here is a minimal reproducible example:

from darts.datasets import AirPassengersDataset
from darts.models import RegressionEnsembleModel, LinearRegressionModel

m1 = LinearRegressionModel(lags=[-1, -12], output_chunk_shift=13, output_chunk_length=24)
m2 = LinearRegressionModel(lags=[-1], output_chunk_shift=13, output_chunk_length=24)

series = AirPassengersDataset().load()
m1.fit(series)
m2.fit(series)

models = [m1, m2]

regression_model = LinearRegressionModel(
    lags_future_covariates=[0],
    output_chunk_length=24
)

ensemble_model = RegressionEnsembleModel(
    forecasting_models=models,
    regression_model=regression_model,
    regression_train_n_points=-1,
    train_forecasting_models=False,
    train_using_historical_forecasts=True,
)

ensemble_model.fit(series=series)
pred = ensemble_model.predict(series=series, n=24)

dennisbader · Apr 15 '25 18:04