[ENH] `skforecast` integration for time series hyperparameter tuning
Summary
This PR adds a full integration with skforecast, allowing Hyperactive to optimize hyperparameters of skforecast forecasting models using any of its optimization algorithms.
Implementation Details
`SkforecastExperiment` (`skforecast_forecasting.py`)
- Inherits from `BaseExperiment`.
- Uses `skforecast.model_selection.backtesting_forecaster` inside `_evaluate()` to perform time-series cross-validation for each parameter set.
- Clones the forecaster and applies new parameters with `set_params()` before every evaluation (see the sketch after this list).
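As a rough illustration of the evaluation flow described above, here is a minimal sketch. It assumes the skforecast >= 0.14 layout (`TimeSeriesFold` plus `backtesting_forecaster`); the helper name and exact argument handling are illustrative, not the PR's actual code.

```python
import copy

from skforecast.model_selection import TimeSeriesFold, backtesting_forecaster


def evaluate_candidate(forecaster, params, y, steps, initial_train_size, metric):
    """Score one hyperparameter set via backtesting (illustrative helper)."""
    candidate = copy.deepcopy(forecaster)  # fresh copy so trials stay independent
    candidate.set_params(params)           # skforecast forecasters take a dict here
    cv = TimeSeriesFold(steps=steps, initial_train_size=initial_train_size)
    metric_values, _ = backtesting_forecaster(
        forecaster=candidate, y=y, cv=cv, metric=metric
    )
    # backtesting_forecaster returns the metric result first, predictions second;
    # in recent skforecast versions the metric comes back as a one-row DataFrame
    return float(metric_values.iloc[0, 0])
```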
`SkforecastOptCV` (`skforecast_opt_cv.py`)
- `sklearn`-style estimator (inherits from `BaseEstimator`).
- Works with `ForecasterRecursive` and other compatible skforecast forecasters.
- `fit()`:
  - Builds a `SkforecastExperiment` with user settings (`steps`, `initial_train_size`, `metric`, etc.).
  - Runs Hyperactive's optimizer to search for the best hyperparameters.
  - Refits the best forecaster on all available data.
- `predict()`:
  - Delegates to `best_forecaster_.predict()` for easy forecasting after optimization (usage sketch below).
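A hypothetical end-to-end usage sketch of the estimator described above; the import path for `SkforecastOptCV` and the `param_grid` argument name are assumptions based on this PR's description, not confirmed API.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from skforecast.recursive import ForecasterRecursive

# assumed import path; the PR places the class in skforecast_opt_cv.py
from hyperactive.integrations.skforecast import SkforecastOptCV

y = pd.Series(np.random.RandomState(0).normal(size=120))

opt = SkforecastOptCV(
    forecaster=ForecasterRecursive(regressor=RandomForestRegressor(), lags=12),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, 5]},  # name assumed
    steps=6,
    initial_train_size=90,
    metric="mean_absolute_error",
)
opt.fit(y)             # runs the Hyperactive search, then refits on all data
preds = opt.predict()  # delegates to best_forecaster_.predict()
```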
Configuration
- Added `skforecast` as an optional dependency in `pyproject.toml` under the `integrations` extra.
Verification
- Added `skforecast_example.py`, showing a HillClimbing search with `ForecasterRecursive` + `RandomForestRegressor` (a sketch of this pattern follows below).
- Added unit tests to verify parameter handling, experiment execution, and integration flow.
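For orientation, a minimal sketch of what such an example might look like, assuming Hyperactive's v5-style optimizer API (`HillClimbing` with an `experiment` argument and `.solve()`) and an assumed `SkforecastExperiment` import path and constructor; the real `skforecast_example.py` in this PR may differ.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from skforecast.recursive import ForecasterRecursive

from hyperactive.opt import HillClimbing
# assumed import path for the new experiment class
from hyperactive.experiment.integrations import SkforecastExperiment

y = pd.Series(np.random.RandomState(0).normal(size=120))

experiment = SkforecastExperiment(  # constructor arguments assumed from the PR text
    forecaster=ForecasterRecursive(regressor=RandomForestRegressor(), lags=12),
    y=y,
    steps=6,
    initial_train_size=90,
    metric="mean_absolute_error",
)
search_space = {"n_estimators": [50, 100, 200], "max_depth": [2, 4, 8]}
optimizer = HillClimbing(search_space=search_space, n_iter=30, experiment=experiment)
best_params = optimizer.solve()
print(best_params)
```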
Closes
Fixes #199
Hi @fkiraly! I modified the docstrings as you suggested, added `get_test_params()`, and verified all the tests. I also ran pre-commit on the changed files. Kindly verify this.
Hi @fkiraly! I committed the changes as you suggested:
- Added `skforecast` to `sktime-integration`.
- Modified the CI to avoid the disk storage issue in the CI build: https://github.com/SimonBlanke/Hyperactive/actions/runs/19598614085/job/56149302140?pr=208
- Checked `SkforecastExperiment` against all tests.
Hi, I’ve taken a look at the code related to skforecast, and it looks good. Thanks for the work, @Omswastik-11!
Hi @JoaquinAmatRodrigo @fkiraly! Thanks 👍. Can you trigger the workflow to see if everything works?
It needs to be triggered by one of the repository's maintainers.
Hi @JoaquinAmatRodrigo, @SimonBlanke, and @fkiraly — I’d appreciate your suggestions on how to proceed here.
The issue is that skforecast currently does not support Python 3.14, which affects the new integration.
I am a bit unsure about two things:
- Where exactly should `skforecast` be declared? Only in `pyproject.toml` under optional extras, or also in the `_tags`?
- When should `test:vm` be enabled or disabled?
What I have done so far
1. Fixes for the test_examples workflow
Example test run: https://github.com/SimonBlanke/Hyperactive/actions/runs/19608701548/job/56518055022?pr=208
I checked the examples/ directory and noticed that many dependencies (e.g., torch, tensorflow) were not actually needed.
Changes:
`pyproject.toml`
- Added a new `test_examples` optional dependency group including `pytest`, the integrations used in examples (`sklearn`, `sktime`, `skforecast`), and `optuna`.
`Makefile`
- Added a new target for running example tests.
`test.yml`
- Updated the `test-examples` job to install only the required libraries.
2. Handling the Python 3.14 compatibility issue
skforecast fails to install on Python 3.14, so I added conditional version constraints:
`skforecast; python_version < "3.14"`
This has been added to all optional dependency groups that require skforecast.
Related CI runs: https://github.com/SimonBlanke/Hyperactive/actions/runs/19608701548/job/56518055064?pr=208 https://github.com/SimonBlanke/Hyperactive/actions/runs/19608701548/job/56518055077?pr=208
3. Local testing
Thanks!
The integration works, nice!
The remaining issues relate to dependency isolation; you have to do it in two places:
- the `get_test_params` function, see above
- in the tags (I think you also need to add a few others)
A sketch of both follows.
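To make the two places concrete, here is a minimal sketch of the usual scikit-base soft-dependency pattern; the tag key and import paths are assumptions based on ecosystem convention, not this PR's exact code.

```python
# assumed import path for the base class
from hyperactive.base import BaseExperiment


class SkforecastExperiment(BaseExperiment):
    _tags = {
        # tells the test framework how to handle this class when the
        # soft dependency is absent (tag key per skbase convention)
        "python_dependencies": "skforecast",
    }

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        # import inside the method so test collection never imports skforecast
        from sklearn.linear_model import LinearRegression
        from skforecast.recursive import ForecasterRecursive

        forecaster = ForecasterRecursive(regressor=LinearRegression(), lags=3)
        # remaining keys depend on the experiment's actual constructor
        return [{"forecaster": forecaster}]
```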
Besides this, the "higher is better" property needs to be inferred from the `metric` and set as a tag in `__init__`. @JoaquinAmatRodrigo, is there a programmatic way to do this?
We do not have a programmatic strategy for this. So far, all the regression metrics that Skforecast allows to be passed as a string are intended to be minimised.
"mean_squared_error": mean_squared_error,
"mean_absolute_error": mean_absolute_error,
"mean_absolute_percentage_error": mean_absolute_percentage_error,
"mean_squared_log_error": mean_squared_log_error,
"mean_absolute_scaled_error": mean_absolute_scaled_error,
"root_mean_squared_scaled_error": root_mean_squared_scaled_error,
"median_absolute_error": median_absolute_error,
"symmetric_mean_absolute_percentage_error": symmetric_mean_absolute_percentage_error
If the user passes a custom function as a metric, they need to indicate whether it is a maximisation or minimisation.
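Since every built-in string metric above is an error measure, one way to wire this up is a small inference helper; this is a sketch of the approach under discussion, not code from the PR.

```python
def infer_higher_is_better(metric, higher_is_better=None):
    """Decide the optimization direction for a skforecast metric.

    All metrics skforecast accepts as strings are error measures and are
    therefore minimised; for a custom callable the user must say so explicitly.
    """
    if callable(metric):
        if higher_is_better is None:
            raise ValueError(
                "for a custom metric function, pass higher_is_better explicitly"
            )
        return higher_is_better
    return False  # built-in string metrics: lower is better
```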
Hi @JoaquinAmatRodrigo! Thanks for the clarification. So it would be better to pass the metric in the constructor, e.g. `metric='mse', scoring='lower'` as the default? In `metric` the user could pass their own metric function, and change `scoring` to 'lower' or 'higher' depending on whether they want to minimize or maximize that metric.
Looks like a great solution. You might want to double-check the keywords; libraries usually use 'maximize' or 'minimize'. Not sure which ones this library uses.
Thanks @JoaquinAmatRodrigo!! In my recent changes I have added a boolean parameter named `higher_is_better`. https://github.com/SimonBlanke/Hyperactive/pull/208/commits/ca013a4c83ce583a8fa0766dd6416cb7e2953e5c
I checked `sktime_forecasting.py`; there, `scoring` serves as the parameter for both the metric and its direction, but since we have a separate parameter for the metric, I just added a boolean parameter.
https://github.com/SimonBlanke/Hyperactive/blob/main/src/hyperactive/experiment/integrations/sktime_forecasting.py#L171
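For reference, the pattern being described might look like the following sketch; the tag key `property:higher_or_lower_is_better` and the constructor signature are hypothetical, not confirmed Hyperactive API.

```python
# assumed import path, as in the earlier sketch
from hyperactive.base import BaseExperiment


class SkforecastExperiment(BaseExperiment):
    def __init__(self, forecaster, y, metric="mean_absolute_error",
                 higher_is_better=False):
        self.higher_is_better = higher_is_better
        super().__init__()
        # record the optimization direction so optimizers know which way to search
        # (tag key below is hypothetical)
        self.set_tags(**{
            "property:higher_or_lower_is_better":
                "higher" if higher_is_better else "lower",
        })
```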
@fkiraly and @SimonBlanke any suggestions ?
Hi @fkiraly! Have a look at this; I explained why I made the changes: https://github.com/SimonBlanke/Hyperactive/pull/208#issuecomment-3585153351
> Have a look at this. I explained why I made the changes
What you say in the comment you mention is not consistent with the actual changes in the pyproject.
If you are using AI, please watch what it is doing.
Hi @fkiraly, sorry for the issue. It seems I forgot to remove the initial changes, namely the following:
- name: Free Disk Space (Ubuntu)
  if: runner.os == 'Linux'
  run: |
    sudo rm -rf /usr/share/dotnet
    sudo rm -rf /usr/local/lib/android
    sudo rm -rf /opt/ghc
    sudo rm -rf /opt/hostedtoolcache/CodeQL
    sudo docker image prune --all --force
The rest is the same as I wrote in the comments.
As for the changes you mentioned in pyproject.toml, this is the commit where I made them:
https://github.com/SimonBlanke/Hyperactive/pull/208/commits/3d92878e05a64569ffc33eb960fded1a3726f18e
Can you pinpoint where I made the mistakes?
All tests are currently passing, by the way: https://github.com/SimonBlanke/Hyperactive/pull/208/checks?sha=ca013a4c83ce583a8fa0766dd6416cb7e2953e5c
Hi @SimonBlanke! Can you re-run the workflow? I have reverted the changes and added the step below, which solved the storage issue last time.
Refer to this: https://github.com/hyperactive-project/Hyperactive/actions/runs/19744054270/job/56593197910?pr=208
- name: Free Disk Space (Ubuntu)
  if: runner.os == 'Linux'
  run: |
    sudo rm -rf /usr/share/dotnet
    sudo rm -rf /usr/local/lib/android
    sudo rm -rf /opt/ghc
    sudo rm -rf /opt/hostedtoolcache/CodeQL
    sudo docker image prune --all --force