
[BUG] `AutoETS` gives different forecasts on different machines and operating systems for the same input data

Open kumar781 opened this issue 2 years ago • 25 comments

(shorter MRE by @bbhushan-ds moved here for simplicity)

AutoETS predict gives different results on different operating systems, even though I have checked that all my library versions are the same. Running this simple program on Windows, Ubuntu, and Docker (Ubuntu) produces different results.

To Reproduce

from sktime.datasets import load_airline
from sktime.forecasting.ets import AutoETS

y = load_airline()
forecaster = AutoETS(auto=False, additive_only=True, n_jobs=-1, allow_multiplicative_trend=False,
                random_state=22, trend='add', seasonal='add', sp=12)
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3, 4, 5, 6, 7, 8, 9])

Expected behavior and Difference

Windows:

1961-01    451.578434
1961-02    427.259888
1961-03    464.704207
1961-04    500.104578
1961-05    508.663801
1961-06    567.717163
1961-07    647.899372
1961-08    632.464688
1961-09    534.377506

Linux:

1961-01    451.569972
1961-02    427.239419
1961-03    464.659612
1961-04    500.068765
1961-05    508.625644
1961-06    567.676560
1961-07    647.863513
1961-08    632.421329
1961-09    534.330773

If I run it with my own, more erratic dataset, the differences are more drastic. I can share the data.

Versions

from sktime import show_versions; show_versions()

Windows:

Python dependencies:
pip: 24.0
setuptools: 65.6.2
sklearn: 0.24.1
sktime: 0.13.2
statsmodels: 0.13.2
numpy: 1.22.4
scipy: 1.8.1
pandas: 1.1.5
matplotlib: 3.6.2
joblib: 1.0.1
numba: 0.59.1
pmdarima: 2.0.2
tsfresh: None

Linux:

/usr/local/lib/python3.9/dist-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

System: python: 3.9.19 (main, Apr 6 2024, 17:57:55) [GCC 11.4.0] executable: /usr/bin/python machine: Linux-4.18.0-513.9.1.el8_9.x86_64-x86_64-with-glibc2.35

Python dependencies:
pip: 24.0
setuptools: 65.6.2
sklearn: 0.24.1
sktime: 0.13.2
statsmodels: 0.13.2
numpy: 1.22.4
scipy: 1.8.1
pandas: 1.1.5
matplotlib: 3.6.2
joblib: 1.0.1
numba: 0.59.1
pmdarima: 2.0.2
tsfresh: None

Thanks in advance!


Original bug report by @kumar781


Describe the bug

To Reproduce

from sktime.forecasting.ets import AutoETS
from sktime.utils.plotting import plot_series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

class TimeSeriesForecast:

    def __init__(self, data):
        self.data = data
        self.train_data = None
        self.val_data = None
        self.model = None

    def split_data(self, train_size):
        """
        Split the time series data into training and validation sets
        """
        train_size = int(len(self.data) * train_size)
        self.train_data = self.data[:train_size]
        self.val_data = self.data[train_size:]

    def train_model(self):
        """
        Train the auto-ETS model on the training data
        """
        self.model = AutoETS(auto=True, sp=12, n_jobs=-1)
        self.model.fit(self.train_data)

    def update_val_data(self):
        """
        Update the validation data to the trained auto-ETS model using update()
        """
        self.model.update(self.val_data)

    def forecast_future(self, periods):
        """
        Forecast future periods using the trained model
        """
        forecast = self.model.predict(fh=np.arange(len(self.val_data), len(self.val_data) + periods))
        return forecast

    def plot_data(self):
        """
        Plot the original data, training data, validation data, and forecast
        """
        plt.figure(figsize=(12, 6))
        plt.plot(self.data, label='Original Data', color='blue')
        plt.plot(self.train_data, label='Training Data', color='green')
        plt.plot(self.val_data, label='Validation Data', color='orange')
        plt.xlabel('Time')
        plt.ylabel('Value')
        plt.title('Time Series Forecasting using Auto-ETS')
        plt.legend()
        plt.show()

if __name__ == "__main__":
    # Example time series data
    data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
                      110, 120, 130, 140, 150, 160, 170, 180, 190, 200])

    # Create TimeSeriesForecast object
    ts_forecast = TimeSeriesForecast(data)

    # Split data into training and validation sets
    ts_forecast.split_data(train_size=0.8)

    # Train the auto-ETS model
    ts_forecast.train_model()

    # Update validation data
    ts_forecast.update_val_data()

    # Forecast 12 future periods ahead
    forecast = ts_forecast.forecast_future(periods=12)
    print(f"Forecast for the next 12 periods: {forecast}")

    # Plot the data
    ts_forecast.plot_data()

Expected behavior

Additional context

Versions sktime 0.13.2

kumar781 avatar Apr 11 '24 19:04 kumar781

Thanks for reporting.

Could you check whether the bug is still present for a more recent sktime version, e.g., 0.28.0?

fkiraly avatar Apr 11 '24 19:04 fkiraly

Just checked with the latest sktime==0.28.0. I still see fluctuations in the forecasts on different machines (Windows, Linux, etc.), using the same input data across all of them.

kumar781 avatar Apr 11 '24 21:04 kumar781

I vaguely remember this might have been reported somewhere already, but I cannot find the issue, if there was one.

Perhaps you remember, @yarnabrina?

fkiraly avatar Apr 11 '24 21:04 fkiraly

There was an issue (reported by @ngupta23, I think) that the auto=True case doesn't always work in the probabilistic setup, but I don't remember one about this.

@kumar781 there is a random_state parameter, have you tried setting it? Does it still lead to different results?

Can you please share the results you received on the different machines, along with the installed pandas and statsmodels versions on them?

yarnabrina avatar Apr 12 '24 04:04 yarnabrina

I have tried with random_state=None and random_state=42, but I still get variations in the results. Ideally, this is my model with hyper-parameters:

model = AutoETS(auto=False, additive_only=True, sp=12, trend='add', damped_trend=True, random_state=42,
                allow_multiplicative_trend=False, bounds={"damping_trend": (0, 0.8)}, error='add',
                seasonal='add', n_jobs=-1)

kumar781 avatar Apr 12 '24 04:04 kumar781

random_state=None means not specifying any seed, so that has no effect. Can you please share the results from the random_state=42 case on the different systems?
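(Editorial aside, not from the original thread: a fixed integer seed only makes pseudo-random draws reproducible within one machine; it cannot force a numerical optimizer to converge identically across platforms. A minimal numpy-only sketch of what seeding does and does not guarantee:)

```python
import numpy as np

# the same integer seed reproduces the same pseudo-random stream;
# this is all that a random_state parameter can guarantee
a = np.random.default_rng(42).normal(size=3)
b = np.random.default_rng(42).normal(size=3)
print(np.array_equal(a, b))  # True

# seeding with None draws fresh entropy, so every run differs
c = np.random.default_rng(None).normal(size=3)
d = np.random.default_rng(None).normal(size=3)
print(np.array_equal(c, d))  # False (with overwhelming probability)
```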

Also, please share the installed version of statsmodels and pandas in all of these systems.

yarnabrina avatar Apr 12 '24 04:04 yarnabrina

pandas version: 1.1.5
statsmodels version: 0.13.2

I have kept the above versions the same on all of the machines.

kumar781 avatar Apr 12 '24 04:04 kumar781

(Screenshot showing differences in forecast values on two machines.)

kumar781 avatar Apr 12 '24 14:04 kumar781

I have two questions about the script you provided as a reproducible example.

  1. The script does not run for me. It fails with ValueError: Cannot compute initial seasonals using heuristic method with less than two full seasonal cycles in the data., which makes sense, as you are using 12 as the periodicity with a training dataset of only 16 observations. Can you please confirm that this script works for you as-is?
  2. I noticed that in prediction you are skipping the first few horizons, up to the length of the validation data. Can you please explain why? Since you are updating the trained model with the validation data, the model is up to date until the end of the validation period, and fh=1 should correspond to the first future data point (@fkiraly please correct me if I am wrong here). So I am not following the rationale behind the gap; can you please explain?

yarnabrina avatar Apr 13 '24 06:04 yarnabrina

and fh=1 should correspond to the first future data point (@fkiraly please correct me if I am wrong here)

Yes, confirming that this agrees with my understanding as well.
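(Editorial aside: a small sketch of the relative fh semantics described above; it is illustrative only, using pandas period arithmetic rather than an actual sktime call.)

```python
import pandas as pd

# after fit() on the training data and update() with the validation data,
# the forecaster's cutoff is the last observed period; relative horizons
# count from there, so fh=1 is the first truly future period
val = pd.period_range("2021-05", periods=4, freq="M")
cutoff = val[-1]  # 2021-08, end of validation data
future_index = [cutoff + h for h in (1, 2, 3)]
print([str(p) for p in future_index])  # ['2021-09', '2021-10', '2021-11']
```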

fkiraly avatar Apr 13 '24 09:04 fkiraly

Sorry for the confusion, let me make it simple for you. Here is the reference code:

from sktime.forecasting.ets import AutoETS
from sktime.utils.plotting import plot_series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

class TimeSeriesForecast:

    def __init__(self, data):
        self.data = data
        self.train_data = None
        self.val_data = None
        self.model = None

    def split_data(self, train_size):
        """
        Split the time series data into training and validation sets
        """
        train_size = int(len(self.data) * train_size)
        self.train_data = self.data[:train_size]
        self.val_data = self.data[train_size:]
        print(self.train_data)
        print(self.val_data)

    def train_model(self):
        """
        Train the auto-ETS model on the training data
        """
        self.model = AutoETS(auto=False, additive_only=True, sp=12, trend='add', damped_trend=True, random_state=22,
                             allow_multiplicative_trend=False, bounds={"damping_trend": (0, 0.8)}, error='add',
                             seasonal='add', n_jobs=-1)
        self.model.fit(self.train_data)

    def update_val_data(self):
        """
        Update the trained auto-ETS model with the validation data using update()
        """
        self.model.update(self.val_data)

    def forecast_future(self, periods):
        """
        Forecast future periods using the trained model
        """
        forecast = self.model.predict(fh=np.arange(len(self.val_data), len(self.val_data) + periods))
        forecast = [max(0, x) for x in forecast]
        return forecast

    def plot_data(self):
        """
        Plot the original data, training data, validation data, and forecast
        """
        plt.figure(figsize=(12, 6))
        plt.plot(self.data, label='Original Data', color='blue')
        plt.plot(self.train_data, label='Training Data', color='green')
        plt.plot(self.val_data, label='Validation Data', color='orange')
        plt.xlabel('Time')
        plt.ylabel('Value')
        plt.title('Time Series Forecasting using Auto-ETS')
        plt.legend()
        plt.show()

data = pd.Series([269, 203, 29, 146, 102, 157, 284, 65, 126, 208, 32, 45, 57, 391, 94, 301, 94, 420, 125, 8, 204, 14,
                  31, 21, 12, 153, 102, 15, 13, 295, 15, 0, 48, 6, 16])

ts_forecast = TimeSeriesForecast(data)

ts_forecast.split_data(train_size=0.75)

ts_forecast.train_model()

ts_forecast.update_val_data()

forecast = ts_forecast.forecast_future(periods=12)
print(f"Forecast for the next 12 periods: {forecast}")

ts_forecast.plot_data()

Windows results: Forecast for the next 12 periods: [0, 78.90309073092988, 33.06003200221992, 0, 0, 0, 169.3505577137282, 11.244189654172164, 97.72735644642658, 16.12452980090852, 234.47665032387786, 94.01643707146793]

Linux results: Forecast for the next 12 periods: [0, 76.8419388442535, 26.526535060455377, 0, 0, 21.54934417187846, 168.65641428001874, 6.814078404802547, 91.7463686880784, 11.494843620057866, 237.82799501148597, 87.35389929715916]

kumar781 avatar Apr 14 '24 04:04 kumar781

Ideally I would want the same forecast values to be reproducible across environments.

kumar781 avatar Apr 14 '24 04:04 kumar781

I have two questions about the script you provided as a reproducible example.

  1. The script does not run for me. It fails with ValueError: Cannot compute initial seasonals using heuristic method with less than two full seasonal cycles in the data., which makes sense, as you are using 12 as the periodicity with a training dataset of only 16 observations. Can you please confirm that this script works for you as-is?
  2. I noticed that in prediction you are skipping the first few horizons, up to the length of the validation data. Can you please explain why? Since you are updating the trained model with the validation data, the model is up to date until the end of the validation period, and fh=1 should correspond to the first future data point (@fkiraly please correct me if I am wrong here). So I am not following the rationale behind the gap; can you please explain?

I see your point. I modified the code to:

from sktime.forecasting.ets import AutoETS
from sktime.utils.plotting import plot_series
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

class TimeSeriesForecast:

    def __init__(self, data):
        self.data = data
        self.train_data = None
        self.val_data = None
        self.model = None

    def split_data(self, train_size):
        """
        Split the time series data into training and validation sets
        """
        train_size = int(len(self.data) * train_size)
        self.train_data = self.data[:train_size]
        self.val_data = self.data[train_size:]
        print(self.train_data)
        print(self.val_data)

    def train_model(self):
        """
        Train the auto-ETS model on the training data
        """
        self.model = AutoETS(auto=False, additive_only=True, sp=12, trend='add', damped_trend=True, random_state=22,
                             allow_multiplicative_trend=False, bounds={"damping_trend": (0, 0.8)}, error='add',
                             seasonal='add', n_jobs=-1)
        self.model.fit(self.train_data)

    def update_val_data(self):
        """
        Update the trained auto-ETS model with the validation data using update()
        """
        self.model.update(self.val_data)

    def forecast_future(self, periods):
        """
        Forecast future periods using the trained model
        """
        fh = np.arange(1, abs(len(self.val_data) - len(self.val_data) + periods) + 1)
        print(fh)
        forecast = self.model.predict(fh=fh)
        forecast = [max(0, x) for x in forecast]
        return forecast

    def plot_data(self):
        """
        Plot the original data, training data, validation data, and forecast
        """
        plt.figure(figsize=(12, 6))
        plt.plot(self.data, label='Original Data', color='blue')
        plt.plot(self.train_data, label='Training Data', color='green')
        plt.plot(self.val_data, label='Validation Data', color='orange')
        plt.xlabel('Time')
        plt.ylabel('Value')
        plt.title('Time Series Forecasting using Auto-ETS')
        plt.legend()
        plt.show()

data = pd.Series([269, 203, 29, 146, 102, 157, 284, 65, 126, 208, 32, 45, 57, 391, 94, 301, 94, 420, 125, 8, 204, 14,
                  31, 21, 12, 153, 102, 15, 13, 295, 15, 0, 48, 6, 16])

ts_forecast = TimeSeriesForecast(data)

ts_forecast.split_data(train_size=0.75)

ts_forecast.train_model()

ts_forecast.update_val_data()

forecast = ts_forecast.forecast_future(periods=12)
print(f"Forecast for the next 12 periods: {forecast}")

ts_forecast.plot_data()

Windows results: Forecast for the next 12 periods: [0, 0, 169.2314796567867, 11.203511283167728, 97.71346026849024, 16.1197827140299, 234.47502866690598, 94.01588309565041, 0, 78.90309073092988, 33.06003200221992, 0]

Results on Docker: Forecast for the next 12 periods: [0, 20.30405202885116, 167.9782638016552, 6.444777053405023, 91.54525773631275, 11.385324348690359, 237.76835394919888, 87.32142048072517, 0, 76.8419388442535, 26.526535060455377, 0]

The forecast values fluctuate considerably between the two environments.

kumar781 avatar Apr 15 '24 16:04 kumar781

@fkiraly @yarnabrina I think I have addressed the issues you pointed out previously. With the new code, there are still forecast fluctuations.

kumar781 avatar Apr 16 '24 12:04 kumar781

@yarnabrina, @kumar781, in a new issue, https://github.com/sktime/sktime/issues/6305, @bbhushan-ds has provided a much simpler minimal example, so I merged the two issues and moved the example to the top.

@bbhushan-ds, I assume you still see the discrepancies if you use the most recent sktime version, 0.28.0?

fkiraly avatar Apr 16 '24 13:04 fkiraly

Tried with sktime 0.28.0; I still see forecast value differences there.

kumar781 avatar Apr 16 '24 14:04 kumar781

@kumar781 @bbhushan-ds I tried to reproduce this on macOS and Ubuntu. I get the same results in both environments (sktime==0.28.0 and statsmodels==0.14.1), but the values I get do differ from the reported results.

On both machines, this is what I received:

>>>
>>> from sktime.datasets import load_airline
>>> from sktime.forecasting.ets import AutoETS
>>>
>>> y = load_airline()
>>>
>>> forecaster = AutoETS(auto=False, additive_only=True, n_jobs=-1, allow_multiplicative_trend=False, random_state=22, trend='add', seasonal='add', sp=12)
>>>
>>> forecaster.fit(y)
AutoETS(additive_only=True, n_jobs=-1, random_state=22, seasonal='add', sp=12,
        trend='add')
>>>
>>> y_pred = forecaster.predict(fh=[1, 2, 3, 4, 5, 6, 7, 8, 9])
>>>
>>> y_pred
1961-01    451.578339
1961-02    427.259695
1961-03    464.703828
1961-04    500.104254
1961-05    508.663452
1961-06    567.716788
1961-07    647.899023
1961-08    632.464277
1961-09    534.377063
Freq: M, Name: Number of airline passengers, dtype: float64
>>>

Question for @kumar781: if you run the example by @bbhushan-ds, do you get the same results as they reported, the same as mine, or a third set of values?

yarnabrina avatar Apr 16 '24 17:04 yarnabrina

Also, can you please check whether this also leads to different results on different operating systems?

from sktime.datasets import load_airline
y = load_airline()

from statsmodels.tsa.exponential_smoothing.ets import ETSModel
forecaster = ETSModel(y, trend="add", seasonal="add", seasonal_periods=12)

fitted_forecaster = forecaster.fit()

y_pred = fitted_forecaster.forecast(steps=9)

If these results also differ across operating systems for both of you, I think we can assume the difference is introduced by statsmodels itself.
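(Editorial aside: a common way to check whether such cross-platform differences are within ordinary floating-point optimizer noise is a tolerance-based comparison. A sketch using the first three airline forecast values reported earlier in this thread:)

```python
import numpy as np

# first three forecast values reported in this thread, per OS
windows = np.array([451.578434, 427.259888, 464.704207])
linux = np.array([451.569972, 427.239419, 464.659612])

# with a loose relative tolerance the runs agree, i.e. the
# discrepancy looks like optimizer/platform noise, not a logic bug
print(np.allclose(windows, linux, rtol=1e-3))  # True

# with a strict tolerance the runs are clearly not identical
print(np.allclose(windows, linux, rtol=1e-6, atol=0))  # False
```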

yarnabrina avatar Apr 16 '24 18:04 yarnabrina

Using the sktime AutoETS:

1961-01    451.578434
1961-02    427.259888
1961-03    464.704207
1961-04    500.104578
1961-05    508.663801
1961-06    567.717163
1961-07    647.899372
1961-08    632.464688
1961-09    534.377506

kumar781 avatar Apr 16 '24 19:04 kumar781

from sktime.datasets import load_airline
y = load_airline()

from statsmodels.tsa.exponential_smoothing.ets import ETSModel
forecaster = ETSModel(y, trend="add", seasonal="add", seasonal_periods=12)

fitted_forecaster = forecaster.fit()

y_pred = fitted_forecaster.forecast(steps=9)

1961-01    451.578434
1961-02    427.259888
1961-03    464.704207
1961-04    500.104578
1961-05    508.663801
1961-06    567.717163
1961-07    647.899372
1961-08    632.464688
1961-09    534.377506

So I get exactly the same results from sktime's AutoETS and from statsmodels directly.

kumar781 avatar Apr 16 '24 19:04 kumar781

However, when I run on a Linux machine, I get different results:

1961-01    451.578741
1961-02    427.260109
1961-03    464.704148
1961-04    500.104779
1961-05    508.664083
1961-06    567.717524
1961-07    647.899930
1961-08    632.465243
1961-09    534.378124

kumar781 avatar Apr 16 '24 19:04 kumar781

I infer, then, that the bug (if it is a bug) is not in sktime but rather in statsmodels. The presence of the random_state argument in sktime makes this a bit counter-intuitive, though; I think we can document the scope of that parameter better.

FYI @fkiraly wondering if you agree with this?

yarnabrina avatar Apr 16 '24 19:04 yarnabrina

I tried introducing a random_state parameter in the statsmodels adapter's get_predictions(); the forecast values still fluctuate across environments. Any other workarounds you have in mind?

kumar781 avatar Apr 16 '24 19:04 kumar781

FYI @fkiraly wondering if you agree with this?

Yes, I agree with your reasoning and the conclusion.

Also related: https://github.com/sktime/sktime/issues/6274

A potential solution could be to tag the estimator as being stochastic? (see issue)

fkiraly avatar Apr 16 '24 19:04 fkiraly

The other question would be, is there a statsmodels issue for this? If not, it might make sense to open one.

fkiraly avatar Apr 16 '24 19:04 fkiraly

We observed the same situation while running the same Docker image on multiple AWS machines. Were core issues or potential workarounds identified?

BaklazhenkoNikita avatar Sep 07 '24 13:09 BaklazhenkoNikita

@yarnabrina concluded that the bug is likely occurring in statsmodels, see here: https://github.com/sktime/sktime/issues/6287#issuecomment-2059777833

although no bug report in statsmodels was opened or linked.

@BaklazhenkoNikita, it would be much appreciated if you could search for one and open one if not present. The underlying estimator in statsmodels is ETSModel from statsmodels.tsa.exponential_smoothing.ets.

fkiraly avatar Sep 07 '24 20:09 fkiraly