pmdarima
ARIMA.arima_res_ doesn't store the pd.Series name, but statsmodels does
Describe the question you have
Hello!
We are creating a wrapper in Skforecast for forecasting using ARIMA models and we are using pmdarima as a dependency.
We are trying to call the statsmodels append() method on ARIMA().arima_res_, and we are seeing different behavior between pmdarima and statsmodels.
ARIMA.arima_res_ has an attribute that stores the original endogenous data (ARIMA().arima_res_.model.data.orig_endog). When statsmodels is used directly, it stores the pd.Series together with its name, but when pmdarima is used the name is dropped.
As a result, when we call append() we get the following error:
ValueError: Columns must match to concatenate along rows.
Reproducible example:
- data:
import pandas as pd
import numpy as np
np.random.seed(123)
y_datetime = pd.Series(data=np.random.rand(50))
y_datetime.name = 'y'
y_datetime.index = pd.date_range(start='2000', periods=50, freq='A')
print(y_datetime.head(5))
last_window_datetime = pd.Series(data=np.random.rand(50))
last_window_datetime.name = 'y'
last_window_datetime.index = pd.date_range(start='2050', periods=50, freq='A')
2000-12-31    0.696469
2001-12-31    0.286139
2002-12-31    0.226851
2003-12-31    0.551315
2004-12-31    0.719469
Freq: A-DEC, Name: y, dtype: float64
- statsmodels: (Here append() works)
from statsmodels.tsa.statespace.sarimax import SARIMAX
mod = SARIMAX(endog=y_datetime, order=(1,1,1))
res = mod.fit()
print(res.model.data.orig_endog.head(5))
new_res = res.append(last_window_datetime, refit=False)
2000-12-31    0.696469
2001-12-31    0.286139
2002-12-31    0.226851
2003-12-31    0.551315
2004-12-31    0.719469
Freq: A-DEC, Name: y, dtype: float64
- pmdarima: (Here the Name is deleted and append() does not work)
from pmdarima.arima import ARIMA
mod = ARIMA(order=(1,1,1))
mod.fit(y_datetime)
print(mod.arima_res_.model.data.orig_endog.head(5))
mod.arima_res_ = mod.arima_res_.append(last_window_datetime, refit=False)
2000-12-31    0.696469
2001-12-31    0.286139
2002-12-31    0.226851
2003-12-31    0.551315
2004-12-31    0.719469
Freq: A-DEC, dtype: float64
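Since the only visible difference between the two outputs is the missing Name, one possible stopgap on our side is to give the new data the same name as the stored endog before calling append(). The sketch below is only an assumption about what would work: match_endog_name is a hypothetical helper (it is not part of pmdarima or statsmodels), and we have not confirmed that this alone makes append() succeed.

```python
import copy


def match_endog_name(new_endog, fitted_endog):
    """Return a copy of new_endog whose name equals fitted_endog's name.

    Hypothetical helper, not part of pmdarima or statsmodels. It works
    with any object exposing a `name` attribute, such as a pd.Series.
    """
    # pmdarima leaves the stored series unnamed, so this is None there.
    target = getattr(fitted_endog, "name", None)
    # Copy first so the caller's series keeps its original name.
    renamed = copy.copy(new_endog)
    renamed.name = target
    return renamed


# Intended use with the objects from the example above (untested assumption):
# stored = mod.arima_res_.model.data.orig_endog
# new_data = match_endog_name(last_window_datetime, stored)
# mod.arima_res_ = mod.arima_res_.append(new_data, refit=False)
```

This only hides the mismatch in the wrapper; the stored series would still be unnamed, so anything that reads the name from arima_res_ later would not see 'y'.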
Versions (if necessary)
Session info:
-----
numpy 1.23.5
pandas 1.4.0
pmdarima 2.0.2
pytest 7.1.2
session_info 1.0.0
skforecast 0.7.dev
sklearn 1.1.0
statsmodels 0.13.5
-----
IPython 8.5.0
jupyter_client 7.3.5
jupyter_core 4.11.1
notebook 6.4.12
-----
Python 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)]
Windows-10-10.0.19042-SP0
-----
Session information updated at 2023-01-09 12:08