How are samples generated with RegressionModel / MLPRegressor?
Hi there,
I tried sklearn's `MLPRegressor` with the `RegressionModel` wrapper, and to my surprise, I was able to generate samples with `historical_forecasts` (e.g. with `num_samples=1000`). How is this possible? Neither `RegressionModel` nor `MLPRegressor` accepts any kind of likelihood or loss function as an argument, or am I missing something?
`model.supports_probabilistic_prediction` is actually `False` in my case, but I can still generate samples.
I would be happy to use this, but it is not officially supported, right?
Hi @dwolffram,
Can you please share a reproducible code snippet? If the model is not probabilistic, it should indeed not be able to generate samples, but I might be missing something...
Hi @madtoinou,
that's what I thought, but somehow I get samples anyway 😅 Or am I doing something wrong?
```python
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPRegressor

from darts import concatenate
from darts.datasets import AirPassengersDataset
from darts.models.forecasting.regression_model import RegressionModel

series = AirPassengersDataset().load()
validation_start = 60

# plain sklearn MLP, no likelihood or probabilistic loss
mlp = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000)

model = RegressionModel(
    model=mlp,
    output_chunk_length=4,
    multi_models=True,
    lags=4,
)

print(model.supports_probabilistic_prediction)  # False

model.fit(series)

# num_samples=1000 should fail for a non-probabilistic model, but it doesn't
hfc = model.historical_forecasts(
    series=series,
    start=validation_start,
    forecast_horizon=4,
    stride=4,
    last_points_only=False,
    retrain=False,
    verbose=True,
    num_samples=1000,
)
hfc = concatenate(hfc, axis=0)

series.plot()
hfc.plot()
plt.show()
```
I just realized that if I set `output_chunk_length` < 4 (i.e. below `forecast_horizon`), I indeed get the error "ValueError: num_samples > 1 is only supported for probabilistic models." No idea if that helps 🤔
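For reference, a minimal variation of the snippet above that triggers the error; it reuses `mlp`, `series`, and `validation_start` from before, and only `output_chunk_length` changes:

```python
# same setup as above, but output_chunk_length=3, so that
# forecast_horizon (4) now exceeds it
model = RegressionModel(
    model=mlp,
    output_chunk_length=3,
    multi_models=True,
    lags=4,
)
model.fit(series)
model.historical_forecasts(
    series=series,
    start=validation_start,
    forecast_horizon=4,
    stride=4,
    last_points_only=False,
    retrain=False,
    num_samples=1000,
)
# ValueError: num_samples > 1 is only supported for probabilistic models.
```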
I did a bit of investigating, and this is the combination of several things:
- the optimized historical forecasts method is called because `retrain=False` and `forecast_horizon <= output_chunk_length`. This method does not rely on `predict()` and parallelizes all the predictions to speed things up. During this parallelization, it also duplicates the axes along the `num_samples` dimension.
- if you look at the forecast, all the samples for a given position/time are exactly identical (with `output_chunk_length` different values); see the check after this list.
- in the plotting, the quantiles are taken from these repeated values.
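A quick way to check this, reusing `hfc` from the snippet above (in darts, `TimeSeries.all_values()` returns an array of shape `(n_times, n_components, n_samples)`):

```python
import numpy as np

vals = hfc.all_values()  # shape: (n_times, n_components, n_samples)
# count how many distinct values hide among the 1000 "samples" at each time step
n_unique = [np.unique(vals[t, 0, :]).size for t in range(vals.shape[0])]
# far smaller than 1000: per the discussion, at most output_chunk_length
# distinct values, just repeated - not independent draws
print(max(n_unique))
```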
An easy way to prevent this would be to add a sanity check on the `num_samples` argument (which is usually taken care of by `predict()`) in the optimized historical forecasts routine, as sketched below. There might be more to it, notably where the predictions are reshaped, but that would require further testing.
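A minimal sketch of such a guard, assuming access to the wrapped model and the `num_samples` argument inside that routine (the names here are hypothetical; the actual check in `predict()` may differ):

```python
# hypothetical guard at the top of the optimized historical forecasts routine,
# mirroring the check that predict() performs on the non-optimized path
if num_samples > 1 and not model.supports_probabilistic_prediction:
    raise ValueError(
        "`num_samples > 1` is only supported for probabilistic models."
    )
```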
Thanks for looking into it!
@madtoinou, why are the quantiles so wide then? Wouldn't 1000 identical predictions imply zero uncertainty?
Because there are still `output_chunk_length` different forecasts, due to the erroneous shape of the input tensor.
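A toy illustration (with made-up numbers) of why a handful of repeated values still produces a wide quantile band:

```python
import numpy as np

# 1000 "samples" that are really just 4 distinct values repeated 250 times each
samples = np.tile([100.0, 110.0, 120.0, 130.0], 250)
# the band spans the spread of those 4 values, despite no real sampling
print(np.quantile(samples, [0.05, 0.95]))  # [100. 130.]
```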