[ENH] Add exogenous variable support to ARIMA
Reference Issues
- Implements #3077.
What does this implement ? Explain your changes.
This PR adds full support for exogenous variables (exog) to ARIMA and AutoARIMA.
Summary of enhancements:
- Added `"capability:exogenous": True` for both `ARIMA` and `AutoARIMA`.
- Modified `_fit`, `_predict`, and `iterative_forecast` in `ARIMA` to:
  - accept exogenous variables,
  - perform an OLS regression of `y` on exog,
  - run ARIMA on the residual series,
  - include regression contribution back into predictions,
  - support multi-step forecasting with future exog.
- Updated `AutoARIMA` so that if `exog` is provided, it is correctly forwarded to the wrapped `ARIMA` model.
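
The two-step approach listed above (OLS on exog, then ARIMA on the residuals) can be sketched with plain NumPy. This is only an illustration, not the code in this PR: the helper names `residualize` and `regression_contribution` are hypothetical, and the ARIMA step itself is left out.

```python
import numpy as np


def residualize(y, exog):
    """Regress y on [1, exog] by OLS and return (beta, residuals).

    Hypothetical helper for illustration; not part of aeon.
    """
    y = np.asarray(y, dtype=float)
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        # A 1-D input is treated as a single regressor.
        exog = exog.reshape(-1, 1)
    if exog.shape[0] != y.shape[0]:
        raise ValueError("exog must have one row per observation in y")
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(y)), exog])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # The residual series has the same length as y.
    return beta, y - X @ beta


def regression_contribution(beta, future_exog):
    """Return the regression part of the forecast for each future exog row."""
    future_exog = np.asarray(future_exog, dtype=float)
    if future_exog.ndim == 1:
        # Same convention as above: one regressor, one value per step.
        future_exog = future_exog.reshape(-1, 1)
    X_future = np.column_stack([np.ones(len(future_exog)), future_exog])
    return X_future @ beta
```

Under this sketch, the ARIMA model would be fitted to the residuals returned by `residualize`, and each forecast would be the ARIMA forecast of the residual plus the matching entry of `regression_contribution`.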
Tests added
- `test_arima_with_exog_basic_fit_predict`
  Verifies that ARIMA fits and predicts correctly when exogenous variables are supplied.
- `test_arima_exog_shape_mismatch_raises`
  Ensures shape mismatch errors are raised for incorrect exog dimensions in `fit` and `predict`.
- `test_arima_iterative_forecast_with_exog`
  Tests multi-step forecasting with future exogenous values.
- `test_arima_no_exog_backward_compatibility`
  Ensures ARIMA without exog behaves exactly as before (backwards compatibility).
All existing ARIMA tests continue to pass, and the new tests confirm correctness and robustness of the exogenous variable support.
Does your contribution introduce a new dependency? If yes, which one?
No new dependencies.
Any other comments?
- All changes were validated through both the existing ARIMA test suite and dedicated exog tests.
- Closes #3077
PR checklist
For all contributions
- [ ] I've added myself to the list of contributors.
- [x] The PR title starts with `[ENH]`.
For new estimators and functions
- [ ] Not applicable
For developers with write access
- [ ] Not applicable.
Thank you for contributing to aeon
I have added the following labels to this PR based on the title: [ enhancement ]. I have added the following labels to this PR based on the changes made: [ forecasting ]. Feel free to change these if they do not properly represent the PR.
The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.
If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.
Don't hesitate to ask questions on the aeon Slack channel if you have any.
Hi @satwiksps,
Thanks for working on this issue. I'm not involved in the Aeon development, so I'm not very familiar with the codebase. After looking at your modifications, I have a question regarding input alignment: the input data (y and exog) are assumed to be aligned. If the ARIMA model includes differencing, it’s necessary to drop from exog the same number of initial observations as the differencing applied to y; otherwise, the alignment is lost.
Is this already handled in your implementation?
Hello @JoaquinAmatRodrigo, I believe the alignment is handled correctly in the current implementation.
- Exogenous variables are used only once, during the initial OLS regression step `y - (X @ beta)`. This produces a residual series that is the same length as `y`, so alignment is preserved at this stage.
- Differencing is applied after residualization: `np.diff(residual_series, n=d)`. At this point, the model works entirely on the differenced residuals and does not use exog anymore, so no trimming or shifting of `exog` is required after differencing.
- For forecasting (one-step or multi-step), future `exog` rows are used only to compute the regression contribution `[1, future_exog_row] @ beta`. This does not depend on differencing and does not require dropping the first `d` rows.
So I think this implementation preserves correct alignment without needing to manually drop the first `d` rows of `exog`. I might be wrong, though. If I am missing something, I would appreciate suggestions on what I should change in my implementation. Thank you.
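
The shape argument can be checked with a small NumPy sketch. The data here is synthetic and the variable names mirror the expressions quoted above, so this only illustrates the lengths involved, not the actual PR code.

```python
import numpy as np

n, d = 20, 2
rng = np.random.default_rng(0)

# Synthetic, aligned inputs: one exog row per observation of y.
exog = rng.normal(size=(n, 1))
y = np.cumsum(rng.normal(size=n)) + 0.5 * exog[:, 0]

# Step 1: OLS of y on [1, exog]; the residuals keep the length of y.
X = np.column_stack([np.ones(n), exog])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residual_series = y - X @ beta

# Step 2: differencing shortens only the residual series by d points.
diffed = np.diff(residual_series, n=d)

# Step 3: a future exog row enters only through the regression term.
future_exog_row = np.array([0.3])
contribution = np.concatenate(([1.0], future_exog_row)) @ beta

print(residual_series.shape, diffed.shape, exog.shape)
```

In this sketch `exog` is never indexed together with `diffed`, which is why no rows are dropped from it.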