[BUG] `AutoETS` fails to fit some time series accurately

Open AzulGarza opened this issue 1 year ago • 5 comments

Describe the bug

Inspired by this issue, we ran some experiments using the AutoETS model. We found that for some unstable series (mainly those with outliers), the forecast performance of this model degrades by many orders of magnitude compared with other implementations (R and StatsForecast); see, for example, the following table, which uses the M4 dataset. I believe the problem is related to the statsmodels implementation (https://github.com/statsmodels/statsmodels/issues/8344).

[image: table of results on the M4 dataset]

Since this type of unstable series requires additional preprocessing, I think it would help users if the documentation included a warning about using the model, particularly when fitting multiple time series.
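As a rough illustration of the kind of preprocessing meant here, the sketch below clips extreme values before a series is passed to a forecaster. It is only a minimal example under stated assumptions: `clip_outliers` is a hypothetical helper (not part of sktime, statsmodels, or StatsForecast), the threshold `k=5.0` is arbitrary, the toy series is made up, and a global median/MAD rule like this one is only reasonable for series without a strong trend or seasonality.

```python
from statistics import median


def clip_outliers(y, k=5.0):
    """Clip values lying more than k scaled MADs from the median.

    Hypothetical helper for illustration only; k is an arbitrary choice.
    """
    med = median(y)
    mad = median([abs(v - med) for v in y])
    if mad == 0:
        # No spread to measure against, so leave the series untouched.
        return list(y)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    half_width = k * 1.4826 * mad
    lower, upper = med - half_width, med + half_width
    return [min(max(v, lower), upper) for v in y]


# Made-up series with an extreme outlier as its first data point.
series = [1e6, 10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1]
cleaned = clip_outliers(series)
```

Only the first value is pulled back toward the rest of the series; the remaining observations pass through unchanged.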

Our results suggest that our version is more robust than current alternatives. If you are interested, I could work on including this version in sktime.

AzulGarza avatar Aug 02 '22 18:08 AzulGarza

Sure, interfacing would be appreciated!

I think currently we are probably going to move towards a stance of allowing multiple implementations of the same "modelling idea".

A decision has not been taken yet; if you want to see the discussion and/or weigh in, see here: https://github.com/alan-turing-institute/sktime/pull/3155

statsforecast and your recent addition of ARIMA are among the more prominent instances related to this discussion, so I would be interested to hear your opinion!

Under the proposed change, any number of ARIMA or AutoETS implementations would be ok to live in sktime (via an interface, or natively), and which one of them is presented as the "default" ARIMA would depend on popularity as well as scientific study results (e.g., the above, reproducible study).

I.e., under the proposed change, sktime core devs would no longer gatekeep the presence of algorithms as strongly as is the case under the current model, which more closely follows the scikit-learn model of a curated selection.

fkiraly avatar Aug 02 '22 18:08 fkiraly

@FedericoGarza I think the data you used has an extreme outlier as its first data point. Could you show the results without that outlier?

aiwalter avatar Aug 02 '22 18:08 aiwalter

@fkiraly: Thanks for the answer. We will start working on the implementation. Congrats on this new direction. We are honored to be able to provide more options for the community. :D

AzulGarza avatar Aug 02 '22 19:08 AzulGarza

@aiwalter: You are right. This benchmark dataset shows that the current statsmodels ETS implementation is not robust, which challenges its purpose as a baseline model. R's and StatsForecast's implementations can fit these series without problems. We believe the root cause is that the current implementation uses an unstable quadratic optimization instead of Nelder-Mead optimization. We argue that series with outliers are common in real-world scenarios and that robustness to them is a critical characteristic of baseline models. It would be helpful for Python's forecasting community to know that previous ETS alternatives struggled with these characteristics.
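For readers unfamiliar with the optimizer mentioned here, the following is a rough pure-Python sketch of a derivative-free Nelder-Mead fit for simple exponential smoothing (smoothing parameter and initial level only). It is not the statsmodels, R, or StatsForecast code and does not reproduce the benchmark: `ses_sse` and `nelder_mead` are hypothetical helpers, the simplex routine is a simplified textbook variant with the usual coefficients, the bound handling via an infinite penalty is an assumption of this sketch, and the toy series is made up.

```python
def ses_sse(params, y):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    alpha, level = params
    if not 0.0 < alpha < 1.0:
        return float("inf")  # crude bound handling (assumption of this sketch)
    sse = 0.0
    for obs in y:
        err = obs - level      # one-step-ahead forecast error
        sse += err * err
        level += alpha * err   # update the smoothed level
    return sse


def nelder_mead(f, x0, step=0.1, iters=200):
    """Simplified textbook Nelder-Mead (reflect, expand, contract, shrink)."""
    n = len(x0)
    # Initial simplex: the start point plus one perturbed copy per dimension.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:
                # Shrink every other vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


# Made-up series with an extreme outlier as its first data point.
y = [1e6, 10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1]
start = [0.5, y[0]]  # initial guess: alpha = 0.5, level = first observation
fit = nelder_mead(lambda p: ses_sse(p, y), start)
```

Because the method only compares function values, it needs no gradients and the best vertex never gets worse between iterations, which is one reason it is often described as forgiving on badly scaled objectives like this one.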

AzulGarza avatar Aug 02 '22 19:08 AzulGarza

> It would be helpful for Python's forecasting community to know that previous ETS alternatives struggled with these characteristics.

I hope you are writing a paper where this is being explained? This sounds like a high impact finding.

fkiraly avatar Aug 03 '22 19:08 fkiraly

Awesome decision @fkiraly 👏

valeman avatar Aug 21 '22 15:08 valeman

FYI: A PR has been created in statsmodels that enables multiple optimizers (including Nelder-Mead) for ETS: https://github.com/statsmodels/statsmodels/pull/8486

topher-lo avatar Nov 07 '22 10:11 topher-lo